Rice metallothionein promoters

ABSTRACT

The present invention provides non-coding regulatory element polynucleotide molecules isolated from the Metallothionein gene of Oryza sativa and useful for expressing transgenes in plants. The invention further discloses compositions, polynucleotide constructs, transformed host cells, transgenic plants and seeds containing the Oryza sativa regulatory polynucleotide sequences, and methods for preparing and using the same.

This application is a Continuation-In-Part of U.S. patent application Ser. No. 11/595,983, filed Nov. 13, 2006, now U.S. Pat. No. 7,790,958, which is a Division of U.S. patent application Ser. No. 09/815,264, filed Mar. 23, 2001, now U.S. Pat. No. 7,365,185, which is a Continuation-in-part of U.S. patent application Ser. No. 09/702,134, filed Oct. 31, 2000, now abandoned, which is a Continuation-in-part of U.S. patent application Ser. No. 09/620,392, filed Jul. 19, 2000, now abandoned, which claims the benefit of U.S. provisional patent application Ser. No. 60/144,351, filed Jul. 20, 1999, all of which are herein incorporated by reference in their entireties.

INCORPORATION OF SEQUENCE LISTING

A sequence listing contained in the file named MONS203USRE.txt, which is 32,282 bytes (as measured in MICROSOFT WINDOWS®) and created on Oct. 3, 2014, is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to the field of plant molecular biology and plant genetic engineering, and polynucleotide molecules useful for gene expression in plants. Specifically, the present invention discloses nucleic acid sequences from Oryza sativa (rice) comprising regulatory elements, such as promoters, leaders and introns, identified from the metallothionein (MTH) gene. The invention further discloses constructs, cells and plants comprising said regulatory elements, and methods of producing and using the same.

BACKGROUND

One of the goals of plant genetic engineering is to produce plants with agronomically desirable characteristics or traits. The proper expression of a desirable transgene in a transgenic plant is one way to achieve this goal. Elements having gene regulatory activity, i.e. regulatory elements such as promoters, leaders, introns and transcription termination regions, are non-coding polynucleotide molecules which play an integral part in the overall expression of genes in living cells. Isolated regulatory elements that function in plants are therefore useful for modifying plant phenotypes through the methods of genetic engineering.

Many regulatory elements are available and are useful for providing good overall gene expression. For example, constitutive promoters such as P-FMV, the promoter from the 35S transcript of the Figwort mosaic virus (U.S. Pat. No. 6,051,753); P-CaMV 35S, the promoter from the 35S RNA transcript of the Cauliflower mosaic virus (U.S. Pat. No. 5,530,196); P-Rice Actin 1, the promoter from the actin 1 gene of Oryza sativa (U.S. Pat. No. 5,641,876); and P-NOS, the promoter from the nopaline synthase gene of Agrobacterium tumefaciens are known to provide some level of gene expression in most or all of the tissues of a plant during most or all of the plant's lifespan. While previous work has provided a number of regulatory elements useful to affect gene expression in transgenic plants, there is still a great need for novel regulatory elements with beneficial expression characteristics. Many previously identified regulatory elements fail to provide the patterns or levels of expression required to fully realize the benefits of expression of selected genes in transgenic crop plants. One example of this is the need for regulatory elements capable of driving gene expression in different types of tissues.

The genetic enhancement of plants and seeds provides significant benefits to society. For example, plants and seeds may be enhanced to have desirable agricultural, biosynthetic, commercial, chemical, insecticidal, industrial, nutritional, or pharmaceutical properties. Despite the availability of many molecular tools, however, the genetic modification of plants and seeds is often constrained by an insufficient or poorly localized expression of the engineered transgene.

Many intracellular processes may impact overall transgene expression, including transcription, translation, protein assembly and folding, methylation, phosphorylation, transport, and proteolysis. Intervention in one or more of these processes can increase the amount of transgene expression in genetically engineered plants and seeds. For example, raising the steady-state level of mRNA in the cytosol often yields an increased accumulation of transgene expression. Many factors may contribute to increasing the steady-state level of an mRNA in the cytosol, including the rate of transcription, promoter strength and other regulatory features of the promoter, efficiency of mRNA processing, and the overall stability of the mRNA.

Among these factors, the promoter plays a central role. Along the promoter, the transcription machinery is assembled and transcription is initiated. This early step is often rate-limiting relative to subsequent stages of protein production. Transcription initiation at the promoter may be regulated in several ways. For example, a promoter may be induced by the presence of a particular compound or external stimulus, express a gene only in a specific tissue, express a gene during a specific stage of development, or constitutively express a gene. Thus, transcription of a transgene may be regulated by operably linking the coding sequence to promoters with different regulatory characteristics. Accordingly, regulatory elements such as promoters play a pivotal role in enhancing the agronomic, pharmaceutical or nutritional value of crops.

At least two types of information are useful in predicting promoter regions within a genomic DNA sequence. First, promoters may be identified on the basis of their sequence “content”, such as transcription factor binding sites and various known promoter motifs (Stormo, Genome Research 10: 394-397 (2000)). Such signals may be identified by computer programs that identify sites associated with promoters, such as TATA boxes and transcription factor (TF) binding sites. Second, promoters may be identified on the basis of their “location”, i.e. their proximity to a known or suspected coding sequence (Stormo, Genome Research 10: 394-397 (2000)). Promoters are typically contained within a region of DNA extending approximately 150-1500 basepairs in the 5′ direction from the start codon of a coding sequence. Thus, promoter regions may be identified by locating the start codon of a coding sequence, and moving beyond the start codon in the 5′ direction to locate the promoter region.
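The “location” and “content” approaches described above can be illustrated with a minimal Python sketch. It is not the method used to identify the sequences of this disclosure. The function names, the 1500-basepair default window, and the simplified "TATAAA" motif are assumptions chosen only for illustration.

```python
def upstream_region(genome: str, start_codon_index: int, max_len: int = 1500) -> str:
    """Return up to max_len bases lying immediately 5' of the start codon.

    Assumes `genome` is the sense strand and `start_codon_index` is the
    0-based position of the A in the ATG start codon (illustrative only).
    """
    if not 0 <= start_codon_index <= len(genome) - 3:
        raise ValueError("start codon index out of range")
    if genome[start_codon_index:start_codon_index + 3].upper() != "ATG":
        raise ValueError("no ATG at the given index")
    begin = max(0, start_codon_index - max_len)
    return genome[begin:start_codon_index]


def motif_offsets(region: str, motif: str = "TATAAA") -> list[int]:
    """Return offsets of each motif occurrence relative to the start codon.

    Offsets are negative: -16 means the motif begins 16 bases 5' of the ATG.
    The default motif is a simplified TATA-box consensus (an assumption).
    """
    region = region.upper()
    hits = []
    pos = region.find(motif)
    while pos != -1:
        hits.append(pos - len(region))
        pos = region.find(motif, pos + 1)
    return hits


# Toy example: a TATA-like motif 16 bases upstream of an ATG.
toy_genome = "GG" + "TATAAA" + "C" * 10 + "ATGGCC"
toy_region = upstream_region(toy_genome, 18)
toy_hits = motif_offsets(toy_region)
```

In practice a candidate region located this way would still need to be tested for regulatory activity; the sketch only shows how the two kinds of information combine.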

It is of immense social, ecological and economic interest to develop plants that have enhanced nutrition, improved resistance to pests, and tolerance to harsh conditions such as drought. Thus, the identification of new genes, regulatory elements (e.g., promoters), etc. that function in various types of plants is useful in developing enhanced varieties of crops. Clearly, there exists a need in the art for new regulatory elements, such as promoters, that are capable of expressing heterologous nucleic acid sequences in important crop species. We found that isolated regulatory elements from the Oryza sativa metallothionein gene, particularly the promoter, leader, and enhancer regulatory elements, provide these enhanced expression patterns for an operably linked transgene in a transgenic plant. Promoters that exhibit both constitutive expression and tissue-specific patterns are of great interest in the development of plants that exhibit agronomically desirable traits.

SUMMARY

The present invention describes the composition and utility of non-coding regulatory element promoter molecules identified from the Oryza sativa (rice) metallothionein gene, also known as MTH.

The present invention includes and provides a substantially purified nucleic acid molecule, or a DNA construct useful for modulating gene expression in plant cells, or a transgenic plant cell, or a transgenic plant, or a fertile transgenic plant, or a seed of a fertile transgenic plant, comprising a nucleic acid sequence wherein the nucleic acid sequence: i) hybridizes under stringent conditions with a sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18 or any complements thereof, or any fragments thereof, or any cis elements thereof, or ii) exhibits an 85% or greater identity to a sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any fragments thereof, or any cis elements thereof.

The present invention includes and provides a method of transforming a host cell comprising: a) providing a nucleic acid molecule that comprises in the 5′ to 3′ direction: a nucleic acid sequence that: i) hybridizes under stringent conditions with a sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any fragments thereof, or any cis elements thereof, or ii) exhibits an 85% or greater identity to a sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any fragments thereof, or any cis elements thereof, operably linked to a transcribable polynucleotide molecule sequence; and b) transforming said host cell with the nucleic acid molecule.

In one embodiment, the invention provides regulatory elements isolated from Oryza sativa and useful for modulating gene expression in transgenic plants. In another embodiment, the invention provides DNA constructs containing polynucleotide molecules useful for modulating gene expression in plants. In another embodiment, the invention provides transgenic plants and seeds comprising the DNA constructs, comprising a promoter or other regulatory elements operably linked to a heterologous DNA molecule, useful for modulating gene expression in plants. The transgenic plant preferably expresses an agronomically desirable phenotype.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: pMON84008, comprising the rice metallothionein promoter P-Os.Metallothionein-b-1:1:2 (SEQ ID NO: 11).

FIG. 2: pMON94302, comprising the rice metallothionein promoter P-Os.Metallothionein-a-1:1:7 (SEQ ID NO: 16).

DETAILED DESCRIPTION OF THE INVENTION

The invention disclosed herein provides polynucleotide molecules having gene regulatory activity identified from the metallothionein (MTH) gene of Oryza sativa. The design, construction, and use of these polynucleotide molecules are one object of this invention. The polynucleotide sequences of these polynucleotide molecules are provided as SEQ ID NO: 1 through SEQ ID NO: 18. These polynucleotide molecules are capable of affecting the expression of an operably linked transcribable polynucleotide molecule in plant tissues and therefore can selectively regulate gene expression in transgenic plants. The present invention also provides methods of modifying, producing, and using the same. The invention also includes compositions, transformed host cells, transgenic plants, and seeds containing the promoters, and methods for preparing and using the same.

Polynucleotide Molecules

Many types of regulatory sequences control gene expression. Not all genes are turned on at all times during the life cycle of a plant. Different genes are required for the completion of different steps in the developmental and sexual maturation of the plant. Two general types of control can be described: temporal regulation, in which a gene is only expressed at a specific time in development (for example, during flowering), and spatial regulation, in which a gene is only expressed in a specific location in the plant (for example, seed storage proteins). Many genes, however, may fall into both classes. For example, seed storage proteins are only expressed in the seed, but they also are only expressed during a short period of time during the development of the seed. Furthermore, because the binding of RNA Polymerase II to the promoter is the key step in gene expression, it follows that sequences may exist in the promoter that control temporal and spatial gene expression.

The following definitions and methods are provided to better define the present invention and to guide those of ordinary skill in the art in the practice of the present invention. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art.

The phrases “coding sequence”, “structural sequence”, and “transcribable polynucleotide sequence” refer to a physical structure comprising an orderly arrangement of nucleic acids. The nucleic acids are arranged in a series of nucleic acid triplets that each form a codon. Each codon encodes a specific amino acid. Thus the coding sequence, structural sequence, and transcribable polynucleotide sequence encode a series of amino acids forming a protein, polypeptide, or peptide sequence. The coding sequence, structural sequence, and transcribable polynucleotide sequence may be contained, without limitation, within a larger nucleic acid molecule, vector, etc. In addition, the orderly arrangement of nucleic acids in these sequences may be depicted, without limitation, in the form of a sequence listing, figure, table, electronic medium, etc.

As used herein, the term “polynucleotide molecule” refers to the single- or double-stranded DNA or RNA molecule of genomic or synthetic origin, i.e., a polymer of deoxyribonucleotide or ribonucleotide bases, respectively, read from the 5′ (upstream) end to the 3′ (downstream) end.

As used herein, the term “polynucleotide sequence” refers to the sequence of a polynucleotide molecule. The nomenclature for nucleotide bases as set forth at 37 CFR § 1.822 is used herein.

As used herein, the term “regulatory element” refers to a polynucleotide molecule having gene regulatory activity, i.e. one that has the ability to affect the transcription or translation of an operably linked transcribable polynucleotide molecule. Regulatory elements such as promoters, leaders, introns, and transcription termination regions are polynucleotide molecules having gene regulatory activity which play an integral part in the overall expression of genes in living cells. Isolated regulatory elements that function in plants are therefore useful for modifying plant phenotypes through the methods of genetic engineering. By “regulatory element” it is intended a series of nucleotides that determines if, when, and at what level a particular gene is expressed. The regulatory DNA sequences specifically interact with regulatory proteins or other proteins.

As used herein, the term “operably linked” refers to a first polynucleotide molecule, such as a promoter, connected with a second transcribable polynucleotide molecule, such as a gene of interest, where the polynucleotide molecules are so arranged that the first polynucleotide molecule affects the function of the second polynucleotide molecule. The two polynucleotide molecules may or may not be part of a single contiguous polynucleotide molecule and may or may not be adjacent. For example, a promoter is operably linked to a gene of interest if the promoter modulates transcription of the gene of interest in a cell.

As used herein, the term “gene regulatory activity” refers to a polynucleotide molecule capable of affecting transcription or translation of an operably linked polynucleotide molecule. An isolated polynucleotide molecule having gene regulatory activity may provide temporal or spatial expression or modulate levels and rates of expression of the operably linked polynucleotide molecule. An isolated polynucleotide molecule having gene regulatory activity may comprise a promoter, intron, leader, or 3′ transcriptional termination region.

As used herein, the term “gene expression” or “expression” refers to the transcription of a DNA molecule into a transcribed RNA molecule. Gene expression may be described as related to temporal, spatial, developmental, or morphological qualities as well as quantitative or qualitative indications. The transcribed RNA molecule may be translated to produce a protein molecule or may provide an antisense or other regulatory RNA molecule.

As used herein, an “expression pattern” is any pattern of differential gene expression. In a preferred embodiment, an expression pattern is selected from the group consisting of tissue, temporal, spatial, developmental, stress, environmental, physiological, pathological, cell cycle, and chemically responsive expression patterns.

As used herein, an “enhanced expression pattern” is any expression pattern for which an operably linked nucleic acid sequence is expressed at a level greater than 0.01%; preferably in a range of about 0.5% to about 20% (w/w) of the total cellular RNA or protein.

As used herein, the term “operably linked” refers to a first polynucleotide molecule, such as a promoter, connected with a second transcribable polynucleotide molecule, such as a gene of interest, where the polynucleotide molecules are so arranged that the first polynucleotide molecule affects the function of the second polynucleotide molecule. The two polynucleotide molecules may or may not be part of a single contiguous polynucleotide molecule and may or may not be adjacent. For example, a promoter is operably linked to a gene of interest if the promoter regulates or mediates transcription of the gene of interest in a cell.

As used herein, the term “transcribable polynucleotide molecule” refers to any polynucleotide molecule capable of being transcribed into an RNA molecule, including but not limited to protein coding sequences (e.g. transgenes) and other sequences (e.g. a molecule useful for gene suppression).

The present invention includes a polynucleotide molecule having a nucleic acid sequence that hybridizes to SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any cis elements thereof, or any fragments thereof. The present invention also provides a nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, or any cis elements thereof, or any fragments thereof. The polynucleotide molecules of the present invention (SEQ ID NO: 1 through SEQ ID NO: 18) were all isolated or identified from the Oryza sativa metallothionein (MTH) gene, and are represented in the polynucleotide constructs listed in Table 1.

TABLE 1
Sequence Annotations for Polynucleotide Molecules Isolated from the MTH gene of Oryza sativa

SEQ ID    Description
1         51237G_55999
2         P-Os.Mth1-1:1:1
3         P-Os.Mth2-1:1:1
4         P-Os.Mth1-1:1:2
5         P-Os.Mth2-1:1:2
6         P-Os.Mth1-1:1:3
7         P-Os.Mth2-1:1:5
8         P-Os.Mth-1:1:1
9         P-Os.Mth-1:1:2
10        P-Os.Mth-1:1:3
11        P-Os.Metallothionein-b-1:1:2
12        P-Os.Metallothionein-a-1:1:1
13        P-Os.Metallothionein-a-1:1:2
14        P-Os.Metallothionein-b-1:1:1
15        P-Os.Metallothionein-a-1:1:3
16        P-Os.Metallothionein-a-1:1:7
17        P-Os.Metallothionein-b-1:1:3
18        P-Os.Metallothionein-b-1:1:4

Determination of Sequence Similarity Using Hybridization Techniques

Nucleic acid hybridization is a technique well known to those of skill in the art of DNA manipulation. The hybridization properties of a given pair of nucleic acids are an indication of their similarity or identity.

The term “hybridization” refers generally to the ability of nucleic acid molecules to join via complementary base strand pairing. Such hybridization may occur when nucleic acid molecules are contacted under appropriate conditions. “Specifically hybridizes” refers to the ability of two nucleic acid molecules to form an anti-parallel, double-stranded nucleic acid structure. A nucleic acid molecule is said to be the “complement” of another nucleic acid molecule if they exhibit “complete complementarity”, i.e., each nucleotide in one sequence is complementary to its base pairing partner nucleotide in another sequence. Two molecules are said to be “minimally complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under at least conventional “low-stringency” conditions. Similarly, the molecules are said to be “complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under conventional “high-stringency” conditions. Nucleic acid molecules that hybridize to other nucleic acid molecules, e.g., at least under low stringency conditions are said to be “hybridizable cognates” of the other nucleic acid molecules. Conventional low stringency and high stringency conditions are described herein and by Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Ed., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989) and by Haymes et al., Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, D.C. (1985). Departures from complete complementarity are permissible, as long as such departures do not completely preclude the capacity of the molecules to form a double-stranded structure.
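The notion of “complete complementarity” defined above can be illustrated computationally. The following Python sketch is an illustration only: the function names are hypothetical, it handles unambiguous DNA bases (A, C, G, T) only, and it says nothing about hybridization stability under any particular stringency.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the anti-parallel complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complete_complement(a: str, b: str) -> bool:
    """True if every base of `a` pairs with its partner base in `b`.

    Both strands are given 5' to 3', so `b` must equal the reverse
    complement of `a` over its full length (case-insensitive).
    """
    return len(a) == len(b) and reverse_complement(a).upper() == b.upper()


strand = "ATGC"
partner = reverse_complement(strand)
```

Molecules that are only “minimally complementary” or “complementary” in the sense used above would fail this exact test while still hybridizing; that behaviour depends on experimental conditions and is not modelled here.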

Low stringency conditions may be used to select nucleic acid sequences with lower sequence identities to a target nucleic acid sequence. One may wish to employ conditions such as about 0.15 M to about 0.9 M sodium chloride, at temperatures ranging from about 20° C. to about 55° C. High stringency conditions may be used to select for nucleic acid sequences with higher degrees of identity to the disclosed nucleic acid sequences (Sambrook et al., 1989). High stringency conditions typically involve nucleic acid hybridization in about 2× to about 10× SSC (diluted from a 20× SSC stock solution containing 3 M sodium chloride and 0.3 M sodium citrate, pH 7.0 in distilled water), about 2.5× to about 5× Denhardt's solution (diluted from a 50× stock solution containing 1% (w/v) bovine serum albumin, 1% (w/v) ficoll, and 1% (w/v) polyvinylpyrrolidone in distilled water), about 10 mg/mL to about 100 mg/mL fish sperm DNA, and about 0.02% (w/v) to about 0.1% (w/v) SDS, with an incubation at about 50° C. to about 70° C. for several hours to overnight. High stringency conditions are preferably provided by 6× SSC, 5× Denhardt's solution, 100 mg/mL fish sperm DNA, and 0.1% (w/v) SDS, with an incubation at 55° C. for several hours. Hybridization is generally followed by several wash steps. The wash compositions generally comprise 0.5× to about 10× SSC, and 0.01% (w/v) to about 0.5% (w/v) SDS with a 15 minute incubation at about 20° C. to about 70° C. Preferably, the nucleic acid segments remain hybridized after washing at least one time in 0.1× SSC at 65° C.
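The SSC concentrations recited above follow by simple dilution from the 20× stock (3 M sodium chloride, 0.3 M sodium citrate). The Python sketch below works through that arithmetic; the function and variable names are hypothetical and it is offered only as a convenience calculation.

```python
# Composition of the 20x SSC stock solution as stated in the text.
STOCK_FOLD = 20.0
STOCK_20X_SSC = {"NaCl_M": 3.0, "sodium_citrate_M": 0.3}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Molar concentrations of each salt in an SSC solution of strength `fold`x."""
    return {name: conc * fold / STOCK_FOLD for name, conc in STOCK_20X_SSC.items()}


def stock_volume_ml(fold: float, final_volume_ml: float) -> float:
    """Volume of 20x stock needed to prepare `final_volume_ml` of `fold`x SSC."""
    return final_volume_ml * fold / STOCK_FOLD


hyb_buffer = ssc_molarity(6)      # preferred hybridization strength, 6x SSC
final_wash = ssc_molarity(0.1)    # preferred final wash strength, 0.1x SSC
stock_needed = stock_volume_ml(6, 100.0)
```

Under these figures, 6× SSC corresponds to about 0.9 M sodium chloride and 0.1× SSC to about 0.015 M, and 100 mL of 6× SSC requires 30 mL of the 20× stock.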

A nucleic acid molecule preferably comprises a nucleic acid sequence that hybridizes, under low or high stringency conditions, with SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, or any fragments thereof, or any cis elements thereof. A nucleic acid molecule most preferably comprises a nucleic acid sequence that hybridizes under high stringency conditions with SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, or any fragments thereof, or any cis elements thereof.

Analysis of Sequence Similarity Using Identity Scoring

As used herein “sequence identity” refers to the extent to which two optimally aligned polynucleotide or peptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids. An “identity fraction” for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in the reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence.

As used herein, the term “percent sequence identity” or “percent identity” refers to the percentage of identical nucleotides in a linear polynucleotide sequence of a reference (“query”) polynucleotide molecule (or its complementary strand) as compared to a test (“subject”) polynucleotide molecule (or its complementary strand) when the two sequences are optimally aligned (with appropriate nucleotide insertions, deletions, or gaps totaling less than 20 percent of the reference sequence over the window of comparison). Methods of optimally aligning sequences over a comparison window are well known to those skilled in the art, and alignment may be conducted by tools such as the local homology algorithm of Smith and Waterman, the homology alignment algorithm of Needleman and Wunsch, the search for similarity method of Pearson and Lipman, and preferably by computerized implementations of these algorithms such as GAP, BESTFIT, FASTA, and TFASTA available as part of the GCG® WISCONSIN PACKAGE® (Accelrys Inc., Burlington, Mass.). An “identity fraction” for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in the reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence. Percent sequence identity is represented as the identity fraction multiplied by 100. The comparison of one or more polynucleotide sequences may be to a full-length polynucleotide sequence or a portion thereof, or to a longer polynucleotide sequence. For purposes of this invention “percent identity” may also be determined using BLASTX version 2.0 for translated nucleotide sequences and BLASTN version 2.0 for polynucleotide sequences.
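The “identity fraction” and “percent sequence identity” defined above reduce to a simple count once two sequences have been aligned. The Python sketch below assumes the alignment has already been produced by one of the tools named above and is supplied as two equal-length strings with "-" marking gaps. It does not perform the alignment itself, and the function names are hypothetical.

```python
def identity_counts(aligned_test: str, aligned_ref: str) -> tuple[int, int]:
    """Return (identical positions, number of reference components).

    Gap characters ('-') in the reference are not counted as components,
    and a gap never counts as a match.
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for t, r in zip(aligned_test.upper(), aligned_ref.upper())
        if r != "-" and t == r
    )
    ref_components = sum(1 for r in aligned_ref if r != "-")
    if ref_components == 0:
        raise ValueError("reference segment contains no components")
    return matches, ref_components


def identity_fraction(aligned_test: str, aligned_ref: str) -> float:
    """Identical components divided by components in the reference segment."""
    matches, ref_components = identity_counts(aligned_test, aligned_ref)
    return matches / ref_components


def percent_identity(aligned_test: str, aligned_ref: str) -> float:
    """Identity fraction multiplied by 100."""
    matches, ref_components = identity_counts(aligned_test, aligned_ref)
    return 100.0 * matches / ref_components


one_mismatch = percent_identity("ACGTACGTAA", "ACGTACGTAC")
one_gap = percent_identity("AC-T", "ACGT")
```

Because the denominator is the reference segment, the result depends on which sequence is treated as the reference; that choice follows the definition given in the text rather than any particular program's output.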

The percent of sequence identity is preferably determined using the “Best Fit” or “Gap” program of the SEQUENCE ANALYSIS SOFTWARE PACKAGE™ (Version 10; Genetics Computer Group, Inc., Madison, Wis.). “Gap” utilizes the algorithm of Needleman and Wunsch (Needleman and Wunsch, Journal of Molecular Biology 48:443-453, 1970) to find the alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. “BestFit” performs an optimal alignment of the best segment of similarity between two sequences and inserts gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman (Smith and Waterman, Advances in Applied Mathematics, 2:482-489, 1981; Smith et al., Nucleic Acids Research 11:2205-2220, 1983). The percent identity is most preferably determined using the “Best Fit” program.

Useful methods for determining sequence identity are also disclosed in Guide to Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and Carillo, H., and Lipman, D., SIAM J. Applied Math (1988) 48:1073. More particularly, preferred computer programs for determining sequence identity include the Basic Local Alignment Search Tool (BLAST) programs which are publicly available from the National Center for Biotechnology Information (NCBI) at the National Library of Medicine, National Institutes of Health, Bethesda, Md. 20894; see BLAST Manual, Altschul et al., NCBI, NLM, NIH; Altschul et al., J. Mol. Biol. 215:403-410 (1990); version 2.0 or higher of BLAST programs allows the introduction of gaps (deletions and insertions) into alignments; for peptide sequence BLASTX can be used to determine sequence identity; and, for polynucleotide sequence BLASTN can be used to determine sequence identity.

As used herein, the term “substantial percent sequence identity” refers to a percent sequence identity of at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity. Thus, one embodiment of the invention is a polynucleotide molecule that has at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity with a polynucleotide sequence described herein. Polynucleotide molecules that are capable of regulating transcription of operably linked transcribable polynucleotide molecules and have a substantial percent sequence identity to the polynucleotide sequences of the polynucleotide molecules provided herein are encompassed within the scope of this invention.

“Homology” refers to the level of similarity between two or more nucleic acid or amino acid sequences in terms of percent of positional identity (i.e., sequence similarity or identity). Homology also refers to the concept of similar functional properties among different nucleic acids or proteins.

In an alternative embodiment, the nucleic acid molecule comprises a nucleic acid sequence that exhibits 70% or greater identity, and more preferably at least 80 or greater, 85 or greater, 87 or greater, 88 or greater, 89 or greater, 90 or greater, 91 or greater, 92 or greater, 93 or greater, 94 or greater, 95 or greater, 96 or greater, 97 or greater, 98 or greater, or 99% or greater identity to a nucleic acid molecule selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, any fragments thereof, or any cis elements thereof. The nucleic acid molecule preferably comprises a nucleic acid sequence that exhibits a 75% or greater sequence identity with a polynucleotide selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, any fragments thereof, or any cis elements thereof. The nucleic acid molecule more preferably comprises a nucleic acid sequence that exhibits an 80% or greater sequence identity with a polynucleotide selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, any fragments thereof, or any cis elements thereof. The nucleic acid molecule most preferably comprises a nucleic acid sequence that exhibits an 85% or greater sequence identity with a polynucleotide selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, any complements thereof, any fragments thereof, or any cis elements thereof.

For purposes of this invention “percent identity” may also be determined using BLASTX version 2.0 for translated nucleotide sequences and BLASTN version 2.0 for polynucleotide sequences. In a preferred embodiment of the present invention, the presently disclosed rice genomic promoter sequences comprise nucleic acid molecules or fragments having a BLAST score of more than 200, preferably a BLAST score of more than 300, and even more preferably a BLAST score of more than 400 with their respective homologues.

Polynucleotide Molecules, Motifs, Fragments, Chimeric Molecules

Nucleic acid molecules of the present invention include nucleic acid sequences that are between about 0.01 Kb and about 50 Kb, more preferably between about 0.1 Kb and about 25 Kb, even more preferably between about 1 Kb and about 10 Kb, and most preferably between about 3 Kb and about 10 Kb, about 3 Kb and about 7 Kb, about 4 Kb and about 6 Kb, about 2 Kb and about 4 Kb, about 2 Kb and about 5 Kb, about 1 Kb and about 5 Kb, about 1 Kb and about 3 Kb, or about 1 Kb and about 2 Kb.

As used herein, the term “fragment” or “fragment thereof” refers to a finite polynucleotide sequence length that comprises at least 25, at least 50, at least 75, at least 85, or at least 95 contiguous nucleotide bases wherein its complete sequence in entirety is identical to a contiguous component of the referenced polynucleotide molecule.
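The definition of “fragment” above amounts to two conditions: a minimum length, and exact identity to a contiguous stretch of the referenced molecule. A minimal Python sketch of that test follows; the function name and the toy sequences are hypothetical, and the default threshold of 25 is only the lowest of the lengths recited above.

```python
def is_fragment(candidate: str, reference: str, min_length: int = 25) -> bool:
    """True if `candidate` is at least `min_length` bases long and its entire
    sequence is identical to a contiguous stretch of `reference`.

    Comparison is case-insensitive and considers only the strand given;
    complements are not checked in this sketch.
    """
    return len(candidate) >= min_length and candidate.upper() in reference.upper()


# Toy 40-base reference; not one of the sequences of the sequence listing.
toy_reference = "ACGTTGCAAGCTTAGGCATCGATCGGATCCTTAAGGCCTA"
long_piece = toy_reference[5:35]     # 30 contiguous bases
short_piece = toy_reference[5:25]    # 20 contiguous bases
altered_piece = long_piece[:10] + "N" + long_piece[11:]
```

Raising `min_length` to 50, 75, 85, or 95 applies the other thresholds recited in the definition.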

As used herein, the term “chimeric” refers to the product of the fusion of portions of two or more different polynucleotide molecules. As used herein, the term “chimeric” refers to a gene expression element produced through the manipulation of known elements or other polynucleotide molecules. Novel chimeric regulatory elements can be designed or engineered by a number of methods. In one embodiment of the present invention, a chimeric promoter may be produced by fusing an enhancer domain from a first promoter to a second promoter. The resultant chimeric promoter may have novel expression properties relative to the first or second promoters. Novel chimeric promoters can be constructed such that the enhancer domain from a first promoter is fused at the 5′ end, at the 3′ end, or at any position internal to the second promoter. The location of the enhancer domain fusion relative to the second promoter may cause the resultant chimeric promoter to have novel expression properties relative to a fusion made at a different location.

In another embodiment of the present invention, chimeric molecules may combine enhancer domains that can confer or modulate gene expression from one or more promoters, by fusing a heterologous enhancer domain from a first promoter to a second promoter with its own partial or complete regulatory elements. Examples of suitable enhancer domains to be used in the practice of the present invention include, but are not limited to, the enhancer domains from promoters such as P-FMV, the promoter from the 35S transcript of the Figwort mosaic virus (described in U.S. Pat. No. 6,051,753, which is incorporated herein by reference) and P-CaMV 35S, the promoter from the 35S RNA transcript of the Cauliflower mosaic virus (described in U.S. Pat. Nos. 5,530,196, 5,424,200, and 5,164,316, all of which are incorporated herein by reference). Construction of chimeric promoters using enhancer domains is described in, for example, U.S. Pat. No. 6,660,911, which is incorporated herein by reference. Thus, the design, construction, and use of chimeric expression elements according to the methods disclosed herein for modulating the expression of operably linked transcribable polynucleotide molecules are encompassed by the present invention.

The invention disclosed herein provides polynucleotide molecules comprising regulatory element fragments that may be used in constructing novel chimeric regulatory elements. Novel combinations comprising fragments of these polynucleotide molecules and at least one other regulatory element or fragment can be constructed and tested in plants and are considered to be within the scope of this invention. Thus, the design, construction, and use of chimeric regulatory elements are one object of this invention.

Regulatory Elements

Gene expression is finely regulated at both the transcriptional and post-transcriptional levels. A spectrum of control regions regulates transcription by RNA polymerase II. Enhancers that can stimulate transcription from a promoter tens of thousands of base pairs away (e.g., the SV40 enhancer) are an example of long-range effectors, whereas more proximal elements include promoters and introns. Transcription initiates at the cap site encoding the first nucleotide of the first exon of an mRNA. For many genes, especially those encoding abundantly expressed proteins, a TATA box located 25-30 base pairs upstream from the cap site directs RNA polymerase II to the start site. Promoter-proximal elements roughly within the first 200 base pairs upstream of the cap site stimulate transcription.

Features of the untranslated regions of mRNAs that control translation, degradation and localization include stem-loop structures, upstream initiation codons and open reading frames, internal ribosome entry sites and various cis-acting elements that are bound by RNA-binding proteins.

The present invention provides the composition and utility of molecules comprising regulatory element sequences identified from Oryza sativa. These regulatory element sequences may comprise promoters, cis-elements, enhancers, terminators, or introns. Regulatory elements may be isolated or identified from UnTranslated Regions (UTRs) from a particular polynucleotide sequence. Any of the regulatory elements described herein may be present in a recombinant construct of the present invention.

One skilled in the art would know various promoters, introns, enhancers, transit peptides, targeting signal sequences, 5′ and 3′ untranslated regions (UTRs), as well as other molecules involved in the regulation of gene expression that are useful in the design of effective plant expression vectors, such as those disclosed, for example, in U.S. Patent Application Publication 2003/01403641 (herein incorporated by reference).

UTRs

UTRs are known to play crucial roles in the post-transcriptional regulation of gene expression, including modulation of the transport of mRNAs out of the nucleus and of translation efficiency, subcellular localization and stability. Regulation by UTRs is mediated in several ways. Nucleotide patterns or motifs located in 5′ UTRs and 3′ UTRs can interact with specific RNA-binding proteins. Unlike DNA-mediated regulatory signals, however, whose activity is essentially mediated by their primary structure, the biological activity of regulatory motifs at the RNA level relies on a combination of primary and secondary structure. Interactions between sequence elements located in the UTRs and specific complementary RNAs have also been shown to play key regulatory roles.

Finally, there are examples of repetitive elements that are important for regulation at the RNA level, affecting translation efficiency. For example, non-translated 5′ leader polynucleotide molecules derived from heat shock protein genes have been demonstrated to enhance gene expression in plants (see for example, U.S. Pat. Nos. 5,659,122 and 5,362,865, both of which are incorporated herein by reference).

Cis-Acting Elements

Many regulatory elements act in cis (“cis elements”) and are believed to affect DNA topology, producing local conformations that selectively allow or restrict access of RNA polymerase to the DNA template or that facilitate selective opening of the double helix at the site of transcriptional initiation. Cis elements occur within the 5′ UTR associated with a particular coding sequence, and are often found within promoters and promoter modulating sequences (inducible elements). Cis elements can be identified using known cis elements as a target sequence or target motif in the BLAST programs of the present invention. Examples of cis-acting elements in the 5′ UTR associated with a polynucleotide coding sequence include, but are not limited to, promoters and enhancers.

Promoters

Among the gene expression regulatory elements, the promoter plays a central role. Along the promoter, the transcription machinery is assembled and transcription is initiated. This early step is often rate-limiting relative to subsequent stages of protein production. Transcription initiation at the promoter may be regulated in several ways. For example, a promoter may be induced by the presence of a particular compound or external stimuli, express a gene only in a specific tissue, express a gene during a specific stage of development, or constitutively express a gene. Thus, transcription of a transgene may be regulated by operably linking the coding sequence to promoters with different regulatory characteristics. Accordingly, regulatory elements such as promoters play a pivotal role in enhancing the agronomic, pharmaceutical or nutritional value of crops.

As used herein, the term “promoter” refers to a polynucleotide molecule that is involved in recognition and binding of RNA polymerase II and other proteins such as transcription factors (trans-acting protein factors that regulate transcription) to initiate transcription of an operably linked gene. A promoter may be isolated from the 5′ untranslated region (5′ UTR) of a genomic copy of a gene. Alternately, promoters may be synthetically produced or manipulated DNA elements. Promoters may be defined by their temporal, spatial, or developmental expression pattern. A promoter can be used as a regulatory element for modulating expression of an operably linked transcribable polynucleotide molecule. Promoters may themselves contain sub-elements such as cis-elements or enhancer domains that affect the transcription of operably linked genes. A “plant promoter” is a native or non-native promoter that is functional in plant cells. A plant promoter can be used as a 5′ regulatory element for modulating expression of an operably linked gene or genes. Plant promoters may be defined by their temporal, spatial, or developmental expression pattern.

Any of the nucleic acid molecules described herein may comprise nucleic acid sequences comprising promoters. Promoters of the present invention can include between about 300 bp upstream and about 10 kb upstream of the trinucleotide ATG sequence at the start site of a protein coding region. Promoters of the present invention can preferably include between about 300 bp upstream and about 5 kb upstream of the trinucleotide ATG sequence at the start site of a protein coding region. Promoters of the present invention can more preferably include between about 300 bp upstream and about 2 kb upstream of the trinucleotide ATG sequence at the start site of a protein coding region. Promoters of the present invention can include between about 300 bp upstream and about 1 kb upstream of the trinucleotide ATG sequence at the start site of a protein coding region. While in many circumstances a 300 bp promoter may be sufficient for expression, additional sequences may act to further regulate expression, for example, in response to biochemical, developmental or environmental signals.
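By way of illustration only, selecting a candidate promoter region of a chosen length immediately upstream of the ATG start codon can be sketched as follows. The function name `upstream_region`, the zero-based `atg_index` convention, and the default length of 300 bp are assumptions made for the example rather than features of the invention.

```python
def upstream_region(genomic: str, atg_index: int, length: int = 300) -> str:
    """Return up to `length` bases lying immediately 5' of the ATG that
    starts at zero-based position `atg_index` in `genomic`.

    Hypothetical helper: if fewer than `length` bases are available
    upstream, the shorter available region is returned.
    """
    if genomic[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG start codon")
    start = max(0, atg_index - length)
    return genomic[start:atg_index]
```

Longer candidate regions (for example 1 kb, 2 kb, 5 kb, or 10 kb) may be obtained by passing a larger `length`.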

The promoter of the present invention preferably transcribes a heterologous transcribable polynucleotide sequence at a high level in a plant. More preferably, the promoter hybridizes to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any fragments thereof. Suitable hybridization conditions include those described above. A nucleic acid sequence of the promoter preferably hybridizes, under low or high stringency conditions, with a molecule selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, complements thereof, or any fragment thereof.

In an alternative embodiment, the promoter comprises a nucleic acid sequence that exhibits 85% or greater identity, and more preferably 86% or greater, 87% or greater, 88% or greater, 89% or greater, 90% or greater, 91% or greater, 92% or greater, 93% or greater, 94% or greater, 95% or greater, 96% or greater, 97% or greater, 98% or greater, or 99% or greater identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, or complements thereof. The promoter most preferably comprises a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 through SEQ ID NO: 18, complements thereof, or any fragments thereof.
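By way of illustration only, an identity threshold such as 85% can be checked on two sequences that have already been aligned to equal length. The following is a minimal sketch under stated assumptions: the alignment is supplied by the user with gaps written as "-", identity is counted as matching columns divided by all alignment columns, and the names `percent_identity` and `meets_identity_threshold` are hypothetical. It is not the BLAST-based determination referred to elsewhere herein.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the
    same base. Inputs must be pre-aligned and of equal length; '-' marks
    a gap. Columns that are gaps in both sequences are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions for the denominator (for example, the length of the shorter sequence) would give different values; the choice made here is only one possibility.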

A promoter comprises promoter fragments that have promoter activity. Promoter fragments may comprise other regulatory elements such as enhancer domains, and may further be useful for constructing chimeric molecules. The identification of the minimum length fragment that retains promoter activity is well within the skill of the art. For example, fragments of the promoters of the present invention comprise at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 400, at least about 500, or at least about 750 contiguous nucleotides, up to and including the full length of each disclosed SEQ ID.

At least two types of information are useful in predicting promoter regions within a genomic DNA sequence. First, promoters may be identified on the basis of their sequence “content”, such as transcription factor binding sites and various known promoter motifs (Stormo, Genome Research 10: 394-397 (2000)). Such signals may be identified by computer programs that identify sites associated with promoters, such as TATA boxes and transcription factor (TF) binding sites. Second, promoters may be identified on the basis of their “location”, i.e. their proximity to a known or suspected coding sequence (Stormo, Genome Research 10: 394-397 (2000)). Promoters are typically found within a region of DNA extending approximately 150-1500 base pairs in the 5′ direction from the start codon of a coding sequence. Thus, promoter regions may be identified by locating the start codon of a coding sequence, and moving beyond the start codon in the 5′ direction to locate the promoter region.

Promoter sequences may be analyzed for the presence of common promoter sequence characteristics, such as a TATA-box and other known transcription factor binding site motifs. These motifs are not found in every known promoter, nor are they necessary for promoter function, but when present, they do indicate that a segment of DNA is a promoter sequence.

For identification of the TATA-box, the putative promoter sequences immediately upstream of the coding start site of the predicted genes within a given sequence size range, as described above, are used. The transcription start site and TATA-box (if present) may be predicted with the program TSSP. TSSP is designed for predicting PolII promoter regions in plants, and is based on discriminant analysis combining characteristics of functional elements of regulatory sequence with the regulatory motifs from Softberry Inc.'s plant RegSite database (Solovyev V.V. (2001) Statistical approaches in Eukaryotic gene prediction. In: Handbook of Statistical genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127). In cases where multiple TATA-boxes are predicted, only the rightmost (closest to the 5′ end) TATA-box is kept. The transcription start sites (TSS) are refined and extended upstream, based on the matches to the database sequences. Promoter sequences with a unique TATA-box, as well as the TATA-box locations, may be identified within the promoter sequences.

For identification of other known transcription factor binding motifs (such as a GC-box, CAAT-box, etc.), the promoter sequences immediately upstream of the coding start site of the predicted genes within a given sequence size range, as described above, are used. The known transcription factor binding motifs (except TATA-box) on the promoter sequences are predicted with a proprietary program PromoterScan. The identification of such motifs provides important information about the candidate promoter. For example, some motifs are associated with informative annotations such as (but not limited to) “light inducible binding site” or “stress inducible binding motif” and can be used to select with confidence a promoter that is able to confer light inducibility or stress inducibility to an operably-linked transgene, respectively.

Putative promoter sequences are also searched with matrices for the GC box (factor name: V_GC_01) and CCAAT box (factor name: F_HAP234_01). The matrices for the GC box and the CCAAT box are from Transfac. The algorithm that is used to annotate promoters searches for matches to both sequence motifs and matrix motifs. First, individual matches are found. For sequence motifs, a maximum number of mismatches are allowed. If the code M, R, W, S, Y, or K is listed in the sequence motif (each of which is a degenerate code for 2 nucleotides) ½ mismatch is allowed. If the code B, D, H, or V is listed in the sequence motif (each of which is a degenerate code for 3 nucleotides) ⅓ mismatch is allowed. Appropriate p values may be determined by simulation by generation of a 5 Mb length of random DNA with the same dinucleotide frequency as the test set, and from this test set the probability of a given matrix score was determined (number of hits/5e7). Once the individual hits are found, the putative promoter sequence is searched for clusters of hits in a 250 bp window. The score for a cluster is found by summing the negative natural log of the p value for each individual hit. Using simulations with 100 Mb lengths, the probability of a window having a cluster score greater than or equal to the given value is determined. Clusters with a p value more significant than p<1e-6 are reported. Effects of repetitive elements are screened. For matrix motifs, a p value cutoff is used on a matrix score. The matrix score is determined by adding the path of a given DNA sequence through a matrix. Appropriate p values are determined by simulation: 5 Mb lengths of random DNA with the same dinucleotide frequency as a test set are generated to test individual matrix hits, and 100 Mb lengths are used to test clusters. The probability of a given matrix score and the probability scores for clusters are determined, as are the sequence motifs. The usual cutoff for matrices is 2.5e-4. No clustering was done for the GC box or CAAT box.
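By way of illustration only, the cluster score described above (the sum of the negative natural log of the p value of each hit falling within a 250 bp window) can be sketched as follows. The function names `cluster_scores` and `best_cluster`, the representation of a hit as a (position, p value) pair, and the choice to start one window at each hit are assumptions made for the example. The simulation used to assign a p value to the cluster score itself is not reproduced here.

```python
import math


def cluster_scores(hits, window=250):
    """For each hit, score the window of `window` bp that starts at that
    hit's position. `hits` is a list of (position, p_value) pairs.

    The score of a window is the sum of -ln(p) over all hits whose
    position lies in [start, start + window).
    """
    for _, p in hits:
        if not 0.0 < p <= 1.0:
            raise ValueError("p values must lie in (0, 1]")
    ordered = sorted(hits)
    scores = []
    for i, (start, _) in enumerate(ordered):
        score = sum(
            -math.log(p) for pos, p in ordered[i:] if pos < start + window
        )
        scores.append((start, score))
    return scores


def best_cluster(hits, window=250):
    """Return the (window_start, score) pair with the highest cluster score."""
    return max(cluster_scores(hits, window), key=lambda item: item[1])
```

In this sketch a hit with a smaller p value contributes more to the score, so a window containing several individually significant hits scores higher than a window containing only one.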

Examples of promoters include: those described in U.S. Pat. No. 6,437,217 (maize RS81 promoter), U.S. Pat. No. 5,641,876 (rice actin promoter), U.S. Pat. No. 6,426,446 (maize RS324 promoter), U.S. Pat. No. 6,429,362 (maize PR-1 promoter), U.S. Pat. No. 6,232,526 (maize A3 promoter), U.S. Pat. No. 6,177,611 (constitutive maize promoters), U.S. Pat. Nos. 5,322,938, 5,352,605, 5,359,142 and 5,530,196 (35S promoter), U.S. Pat. No. 6,433,252 (maize L3 oleosin promoter, P-Zm.L3), U.S. Pat. No. 6,429,357 (rice actin 2 promoter as well as a rice actin 2 intron), U.S. Pat. No. 5,837,848 (root specific promoter), U.S. Pat. No. 6,294,714 (light inducible promoters), U.S. Pat. No. 6,140,078 (salt inducible promoters), U.S. Pat. No. 6,252,138 (pathogen inducible promoters), U.S. Pat. No. 6,175,060 (phosphorus deficiency inducible promoters), U.S. Pat. No. 6,635,806 (gamma-coixin promoter, P-Cl.Gcx), and U.S. patent application Ser. No. 09/757,089 (maize chloroplast aldolase promoter), all of which are incorporated herein by reference in their entirety.

Promoters of the present invention include homologues of cis elements known to affect gene regulation that show homology with the promoter sequences of the present invention. These cis elements include, but are not limited to, oxygen responsive cis elements (Cowen et al., J. Biol. Chem. 268(36):26904-26910 (1993)), light regulatory elements (Bruce and Quail, Plant Cell 2(11):1081-1089 (1990); Bruce et al., EMBO J. 10:3015-3024 (1991); Rocholl et al., Plant Sci. 97:189-198 (1994); Block et al., Proc. Natl. Acad. Sci. USA 87:5387-5391 (1990); Giuliano et al., Proc. Natl. Acad. Sci. USA 85:7089-7093 (1988); Staiger et al., Proc. Natl. Acad. Sci. USA 86:6930-6934 (1989); Izawa et al., Plant Cell 6:1277-1287 (1994); Menkens et al., Trends in Biochemistry 20:506-510 (1995); Foster et al., FASEB J. 8:192-200 (1994); Plesse et al., Mol Gen Genet 254:258-266 (1997); Green et al., EMBO J. 6:2543-2549 (1987); Kuhlemeier et al., Ann. Rev Plant Physiol. 38:221-257 (1987); Villain et al., J. Biol. Chem. 271:32593-32598 (1996); Lam et al., Plant Cell 2:857-866 (1990); Gilmartin et al., Plant Cell 2:369-378 (1990); Datta et al., Plant Cell 1:1069-1077 (1989); Castresana et al., EMBO J. 7:1929-1936 (1988); Ueda et al., Plant Cell 1:217-227 (1989); Terzaghi et al., Annu. Rev. Plant Physiol. Plant Mol. Biol. 46:445-474 (1995); Tjaden et al., Plant Cell 6:107-118 (1994); Tjaden et al., Plant Physiol. 108:1109-1117 (1995); Ngai et al., Plant J. 12:1021-1034 (1997)), elements responsive to gibberellin (Muller et al., J. Plant Physiol. 145:606-613 (1995); Croissant et al., Plant Science 116:27-35 (1996); Lohmer et al., EMBO J. 10:617-624 (1991); Rogers et al., Plant Cell 4:1443-1451 (1992); Lanahan et al., Plant Cell 4:203-211 (1992); Skriver et al., Proc. Natl. Acad. Sci. USA 88:7266-7270 (1991); Gilmartin et al., Plant Cell 2:369-378 (1990); Huang et al., Plant Mol. Biol. 14:655-668 (1990); Gubler et al., Plant Cell 7:1879-1891 (1995)), elements responsive to abscisic acid (Busk et al., Plant Cell 9:2261-2270 (1997); Guiltinan et al., Science 250:267-270 (1990); Shen et al., Plant Cell 7:295-307 (1995); Shen et al., Plant Cell 8:1107-1119 (1996); Seo et al., Plant Mol. Biol. 27:1119-1131 (1995); Marcotte et al., Plant Cell 1:969-976 (1989); Iwasaki et al., Mol Gen Genet. 247:391-398 (1995); Hattori et al., Genes Dev. 6:609-618 (1992); Thomas et al., Plant Cell 5:1401-1410 (1993)), elements similar to abscisic acid responsive elements (Ellerstrom et al., Plant Mol. Biol. 32:1019-1027 (1996)), auxin responsive elements (Liu et al., Plant Cell 6:645-657 (1994); Liu et al., Plant Physiol. 115:397-407 (1997); Kosugi et al., Plant J. 7:877-886 (1995); Kosugi et al., Plant Cell 9:1607-1619 (1997); Ballas et al., J. Mol. Biol. 233:580-596 (1993)), a cis element responsive to methyl jasmonate treatment (Beaudoin and Rothstein, Plant Mol. Biol. 33:835-846 (1997)), a cis element responsive to abscisic acid and stress response (Straub et al., Plant Mol. Biol. 26:617-630 (1994)), ethylene responsive cis elements (Itzhaki et al., Proc. Natl. Acad. Sci. USA 91:8925-8929 (1994); Montgomery et al., Proc. Natl. Acad. Sci. USA 90:5939-5943 (1993); Sessa et al., Plant Mol. Biol. 28:145-153 (1995); Shinshi et al., Plant Mol. Biol. 27:923-932 (1995)), salicylic acid cis responsive elements (Strange et al., Plant J. 11:1315-1324 (1997); Qin et al., Plant Cell 6:863-874 (1994)), a cis element that responds to water stress and abscisic acid (Lam et al., J. Biol. Chem. 266:17131-17135 (1991); Thomas et al., Plant Cell 5:1401-1410 (1993); Pla et al., Plant Mol Biol 21:259-266 (1993)), a cis element essential for M phase-specific expression (Ito et al., Plant Cell 10:331-341 (1998)), sucrose responsive elements (Huang et al., Plant Mol. Biol. 14:655-668 (1990); Hwang et al., Plant Mol Biol 36:331-341 (1998); Grierson et al., Plant J. 5:815-826 (1994)), heat shock response elements (Pelham et al., Trends Genet. 1:31-35 (1985)), elements responsive to auxin and/or salicylic acid and also reported for light regulation (Lam et al., Proc. Natl. Acad. Sci. USA 86:7890-7897 (1989); Benfey et al., Science 250:959-966 (1990)), elements responsive to ethylene and salicylic acid (Ohme-Takagi et al., Plant Mol. Biol. 15:941-946 (1990)), elements responsive to wounding and abiotic stress (Loake et al., Proc. Natl. Acad. Sci. USA 89:9230-9234 (1992); Mhiri et al., Plant Mol. Biol. 33:257-266 (1997)), antioxidant response elements (Rushmore et al., J. Biol. Chem. 266:11632-11639; Dalton et al., Nucleic Acids Res. 22:5016-5023 (1994)), Sph elements (Suzuki et al., Plant Cell 9:799-807 (1997)), elicitor responsive elements (Fukuda et al., Plant Mol. Biol. 34:81-87 (1997); Rushton et al., EMBO J. 15:5690-5700 (1996)), metal responsive elements (Stuart et al., Nature 317:828-831 (1985); Westin et al., EMBO J. 7:3763-3770 (1988); Thiele et al., Nucleic Acids Res. 20:1183-1191 (1992); Faisst et al., Nucleic Acids Res. 20:3-26 (1992)), low temperature responsive elements (Baker et al., Plant Mol. Biol. 24:701-713 (1994); Jiang et al., Plant Mol. Biol. 30:679-684 (1996); Nordin et al., Plant Mol. Biol. 21:641-653 (1993); Zhou et al., J. Biol. Chem. 267:23515-23519 (1992)), drought responsive elements (Yamaguchi et al., Plant Cell 6:251-264 (1994); Wang et al., Plant Mol. Biol. 28:605-617 (1995); Bray E A, Trends in Plant Science 2:48-54 (1997)), enhancer elements for glutenin (Colot et al., EMBO J. 6:3559-3564 (1987); Thomas et al., Plant Cell 2:1171-1180 (1990); Kreis et al., Philos. Trans. R. Soc. Lond., B314:355-365 (1986)), light-independent regulatory elements (Lagrange et al., Plant Cell 9:1469-1479 (1997); Villain et al., J. Biol. Chem. 271:32593-32598 (1996)), OCS enhancer elements (Bouchez et al., EMBO J. 8:4197-4204 (1989); Foley et al., Plant J. 3:669-679 (1993)), ACGT elements (Foster et al., FASEB J. 8:192-200 (1994); Izawa et al., Plant Cell 6:1277-1287 (1994); Izawa et al., J. Mol. Biol. 230:1131-1144 (1993)), negative cis elements in plastid related genes (Zhou et al., J. Biol. Chem. 267:23515-23519 (1992); Lagrange et al., Mol. Cell. Biol. 13:2614-2622 (1993); Lagrange et al., Plant Cell 9:1469-1479 (1997)), prolamin box elements (Forde et al., Nucleic Acids Res. 13:7327-7339 (1985); Colot et al., EMBO J. 6:3559-3564 (1987); Thomas et al., Plant Cell 2:1171-1180 (1990); Thompson et al., Plant Mol. Biol. 15:755-764 (1990); Vicente et al., Proc. Natl. Acad. Sci. USA 94:7685-7690 (1997)), and elements in enhancers from the IgM heavy chain gene (Gillies et al., Cell 33:717-728 (1983); Whittier et al., Nucleic Acids Res. 15:2515-2535 (1987)).

The activity or strength of a promoter may be measured in terms of the amount of mRNA or protein accumulation it specifically produces, relative to the total amount of mRNA or protein. The promoter preferably expresses an operably linked nucleic acid sequence at a level greater than 0.01%; preferably in a range of about 0.5% to about 20% (w/w) of the total cellular RNA or protein.

Alternatively, the activity or strength of a promoter may be expressed relative to a well-characterized promoter (for which transcriptional activity was previously assessed). For example, a less-characterized promoter may be operably linked to a reporter sequence (e.g., GUS) and introduced into a specific cell type. A well-characterized promoter (e.g. the 35S promoter) is similarly prepared and introduced into the same cellular context. Transcriptional activity of the unknown promoter is determined by comparing the amount of reporter expression, relative to the well characterized promoter. In one embodiment, the activity of the present promoter is as strong as the 35S promoter when compared in the same cellular context. The cellular context is preferably maize, sorghum, rice, barley, wheat, canola, or soybean; and more preferably is maize, sorghum, rice, barley, or wheat; and most preferably is soybean or maize.
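By way of illustration only, the comparison of reporter expression between a test promoter and a well-characterized reference promoter can be expressed as a simple ratio. In the sketch below, the function name `relative_activity` and the subtraction of a background reading (for example, from untransformed cells) are assumptions made for the example; the numerical values in any use of it would come from the user's own reporter assay.

```python
def relative_activity(test_signal: float, reference_signal: float,
                      background: float = 0.0) -> float:
    """Reporter signal driven by the test promoter, expressed as a
    fraction of the signal driven by the reference promoter measured in
    the same cellular context. `background` is subtracted from both."""
    reference_net = reference_signal - background
    if reference_net <= 0:
        raise ValueError("reference signal must exceed background")
    return (test_signal - background) / reference_net
```

A returned value of about 1.0 would correspond to a test promoter that is about as strong as the reference promoter under the conditions of the assay.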

Enhancers

Enhancers, which strongly activate transcription, frequently in a specific differentiated cell type, are usually 100-200 base pairs long. Although enhancers often lie within a few kilobases of the cap site, in some cases they lie much further upstream or downstream from the cap site or within an intron. Some genes are controlled by more than one enhancer region, as in the case of the Drosophila even-skipped gene.

As used herein, the term “enhancer domain” refers to a cis-acting transcriptional regulatory element (cis-element), which confers an aspect of the overall modulation of gene expression. An enhancer domain may function to bind transcription factors, trans-acting protein factors that regulate transcription. Some enhancer domains bind more than one transcription factor, and transcription factors may interact with different affinities with more than one enhancer domain. Enhancer domains can be identified by a number of techniques, including deletion analysis, i.e., deleting one or more nucleotides from the 5′ end or internal to a promoter; DNA binding protein analysis using DNase I footprinting, methylation interference, electrophoresis mobility-shift assays, in vivo genomic footprinting by ligation-mediated PCR, and other conventional assays; or by DNA sequence similarity analysis with known cis-element motifs by conventional DNA sequence comparison methods. The fine structure of an enhancer domain can be further studied by mutagenesis (or substitution) of one or more nucleotides or by other conventional methods. Enhancer domains can be obtained by chemical synthesis or by isolation from regulatory elements that include such elements, and they can be synthesized with additional flanking nucleotides that contain useful restriction enzyme sites to facilitate subsequent manipulation.

Translational enhancers may also be incorporated as part of a recombinant vector. Thus the recombinant vector may preferably contain one or more 5′ non-translated leader sequences which serve to enhance expression of the nucleic acid sequence. Such enhancer sequences may be desirable to increase or alter the translational efficiency of the resultant mRNA. Examples of other regulatory element 5′ nucleic acid leader sequences include dSSU 5′, PetHSP70 5′, and GmHSP17.9 5′. A translational enhancer sequence derived from the untranslated leader sequence from the mRNA of the coat protein gene of alfalfa mosaic virus, placed between the promoter and the gene to increase translational efficiency, is described in U.S. Pat. No. 6,037,527, herein incorporated by reference. Thus, the design, construction, and use of enhancer domains according to the methods disclosed herein for modulating the expression of operably linked transcribable polynucleotide molecules are encompassed by the present invention.

Any of the polynucleotide molecules of the present invention may comprise an enhancer.

Leaders

As used herein, the term “leader” refers to a polynucleotide molecule isolated from the untranslated 5′ region (5′ UTR) of a genomic copy of a gene and defined generally as a segment between the transcription start site (TSS) and the coding sequence start site. Alternately, leaders may be synthetically produced or manipulated DNA elements. A “plant leader” is a native or non-native leader that is functional in plant cells. A plant leader can be used as a 5′ regulatory element for modulating expression of an operably linked transcribable polynucleotide molecule.

For example, non-translated 5′ leader polynucleotide molecules derived from heat shock protein genes have been demonstrated to enhance gene expression in plants (see for example, U.S. Pat. Nos. 5,659,122 and 5,362,865, both of which are incorporated herein by reference).

Any of the nucleic acid molecules described herein may comprise nucleic acid sequences comprising leaders. A leader of the present invention preferably assists in the regulation of transcription of a heterologous transcribable polynucleotide sequence at a high level in a plant.

Introns

As used herein, the term “intron” refers to a polynucleotide molecule that may be isolated or identified from the intervening sequence of a genomic copy of a gene and may be defined generally as a region spliced out during mRNA processing prior to translation. Alternately, introns may be synthetically produced or manipulated DNA elements. Introns may themselves contain sub-elements such as cis-elements or enhancer domains that affect the transcription of operably linked genes. A “plant intron” is a native or non-native intron that is functional in plant cells. A plant intron may be used as a regulatory element for modulating expression of an operably linked gene or genes. A polynucleotide molecule sequence in a recombinant construct may comprise introns. The introns may be heterologous with respect to the transcribable polynucleotide molecule sequence.

The transcribable polynucleotide molecule sequence in the recombinant vector may comprise introns. The introns may be heterologous with respect to the transcribable polynucleotide molecule sequence. Examples of regulatory element introns include the corn actin intron and the corn HSP70 intron (U.S. Pat. No. 5,859,347, herein incorporated by reference in its entirety).

Any of the molecules of the present invention may comprise introns. The intron of the present invention preferably affects transcription of a heterologous transcribable polynucleotide sequence at a high level in a plant.

Terminators

The 3′ untranslated regions (3′ UTRs) of mRNAs are generated by specific cleavage and polyadenylation. A 3′ polyadenylation region means a DNA molecule linked to and located downstream of a structural polynucleotide molecule and includes polynucleotides that provide a polyadenylation signal and other regulatory signals capable of affecting transcription, mRNA processing or gene expression. PolyA tails are thought to function in mRNA stability and in initiation of translation.

As used herein, the term “terminator” refers to a polynucleotide sequence that may be isolated or identified from the 3′ untranslated region (3′ UTR) of a transcribable gene, which functions to signal to RNA polymerase the termination of transcription. The polynucleotide sequences of the present invention may comprise terminator sequences.

Polyadenylation is the non-templated addition of a 50 to 200 nt chain of polyadenylic acid (polyA). Cleavage must precede polyadenylation. The polyadenylation signal functions in plants to cause the addition of polyadenylate nucleotides to the 3′ end of the mRNA precursor. The polyadenylation sequence can be derived from the natural gene, from a variety of plant genes, or from Agrobacterium T-DNA genes. Transcription termination often occurs at sites considerably downstream of the sites that, after polyadenylation, are the 3′ ends of most eukaryotic mRNAs.

Examples of 3′ UTR regions are the nopaline synthase 3′ region (nos 3′; Fraley, et al., Proc. Natl. Acad. Sci. USA 80: 4803-4807, 1983), wheat hsp17 (T-Ta.Hsp17), and T-Ps.RbcS2:E9 (pea rubisco small subunit). These, those disclosed in WO0011200A2 (herein incorporated by reference), and other 3′ UTRs known in the art can be tested and used in combination with a DHDPS or AK coding region, herein referred to as T-3′UTR. Another example of terminator regions is given in U.S. Pat. No. 6,635,806, herein incorporated by reference.

Any of the polynucleotide molecules of the present invention may comprise a terminator.

Regulatory Element Isolation and Modification

Any number of methods well known to those skilled in the art can be used to isolate a polynucleotide molecule, or fragment thereof, disclosed in the present invention. For example, PCR (polymerase chain reaction) technology can be used to amplify flanking regions from a genomic library of a plant using publicly available sequence information. A number of methods are known to those of skill in the art to amplify unknown polynucleotide molecules adjacent to a core region of known polynucleotide sequence. Methods include but are not limited to inverse PCR (IPCR), vectorette PCR, Y-shaped PCR, and genome walking approaches. Polynucleotide fragments can also be obtained by other techniques such as by directly synthesizing the fragment by chemical means, as is commonly practiced by using an automated oligonucleotide synthesizer. For the present invention, the polynucleotide molecules were isolated from genomic DNA by designing oligonucleotide primers based on available sequence information and using PCR techniques.

As used herein, the term “isolated polynucleotide molecule” refers to a polynucleotide molecule at least partially separated from other molecules normally associated with it in its native state. In one embodiment, the term “isolated” is also used herein in reference to a polynucleotide molecule that is at least partially separated from nucleic acids which normally flank the polynucleotide in its native state. Thus, polynucleotides fused to regulatory or coding sequences with which they are not normally associated, for example as the result of recombinant techniques, are considered isolated herein. Such molecules are considered isolated even when present, for example, in the chromosome of a host cell, or in a nucleic acid solution. The term “isolated” as used herein is intended to encompass molecules not present in their native state.

Those of skill in the art are familiar with the standard resource materials that describe specific conditions and procedures for the construction, manipulation, and isolation of macromolecules (e.g., polynucleotide molecules, plasmids, etc.), as well as the generation of recombinant organisms and the screening and isolation of polynucleotide molecules.

Short nucleic acid sequences having the ability to specifically hybridize to complementary nucleic acid sequences may be produced and utilized in the present invention. These short nucleic acid molecules may be used as probes to identify the presence of a complementary nucleic acid sequence in a given sample. Thus, by constructing a nucleic acid probe which is complementary to a small portion of a particular nucleic acid sequence, the presence of that nucleic acid sequence may be detected and assessed. Use of these probes may greatly facilitate the identification of transgenic plants which contain the presently disclosed nucleic acid molecules. The probes may also be used to screen cDNA or genomic libraries for additional nucleic acid sequences related to or sharing homology with the presently disclosed promoters and transcribable polynucleotide sequences. The short nucleic acid sequences may be used as probes and specifically as PCR probes. A PCR probe is a nucleic acid molecule capable of initiating a polymerase activity while in a double-stranded structure with another nucleic acid. Various methods for determining the structure of PCR probes and PCR techniques exist in the art. Computer-generated searches using programs such as Primer3, STSPipeline, or GeneUp (Pesole, et al., BioTechniques 25:112-123, 1998), for example, can be used to identify potential PCR primers.
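The use of a probe to detect a complementary sequence in a sample can be illustrated with a minimal Python sketch. It is offered only as an illustration under simplifying assumptions: hybridization is reduced to an exact, full-length match against the reverse complement of the probe, and the function names and example sequences are hypothetical rather than taken from the sequence listing.

```python
from typing import List

# Watson-Crick pairing for DNA; ambiguity codes are not handled in this sketch.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_sites(probe: str, target_strand: str) -> List[int]:
    """Return 0-based start positions on target_strand where the probe
    would pair along its full length (exact match only, an assumption)."""
    site = reverse_complement(probe)
    target = target_strand.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)
    return hits


# Hypothetical sample strand and probe, for illustration only.
sample = "TTGATTACAGG"
probe = reverse_complement("GATTACA")
print(probe_sites(probe, sample))
```

Real hybridization tolerates mismatches and depends on stringency conditions, so an exact-match search of this kind is at best a first-pass screen.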

Alternatively, the short nucleic acid sequences may be used as oligonucleotide primers to amplify or mutate a complementary nucleic acid sequence using PCR technology. These primers may also facilitate the amplification of related complementary nucleic acid sequences (e.g., related nucleic acid sequences from other species).

The primer or probe is generally complementary to a portion of a nucleic acid sequence that is to be identified, amplified, or mutated. The primer or probe should be of sufficient length to form a stable and sequence-specific duplex molecule with its complement. The primer or probe preferably is about 10 to about 200 nucleotides long, more preferably is about 10 to about 100 nucleotides long, even more preferably is about 10 to about 50 nucleotides long, and most preferably is about 14 to about 30 nucleotides long. The primer or probe may be prepared by direct chemical synthesis, by PCR (See, for example, U.S. Pat. Nos. 4,683,195, and 4,683,202, each of which is herein incorporated by reference), or by excising the nucleic acid specific fragment from a larger nucleic acid molecule.
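The nested length preferences stated above can be expressed as a small lookup, sketched here in Python. The sketch treats the "about" bounds as hard limits, which is a simplifying assumption, and the tier labels and function name are hypothetical.

```python
# Length ranges in nucleotides, taken from the text and ordered narrowest first
# so that the most specific tier is reported.
PREFERENCE_TIERS = [
    ("most preferred", 14, 30),
    ("even more preferred", 10, 50),
    ("more preferred", 10, 100),
    ("preferred", 10, 200),
]


def length_tier(oligo: str) -> str:
    """Classify a primer or probe by the narrowest stated range containing its length."""
    n = len(oligo)
    for label, low, high in PREFERENCE_TIERS:
        if low <= n <= high:
            return label
    return "outside stated ranges"


print(length_tier("ACGT" * 5))  # a 20-nucleotide oligo
```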

Transcribable Polynucleotide Molecules

A regulatory element of the present invention may be operably linked to a transcribable polynucleotide sequence that is heterologous with respect to the regulatory element. The term “heterologous” refers to the relationship between two or more nucleic acid or protein sequences that are derived from different sources. For example, a promoter is heterologous with respect to a transcribable polynucleotide sequence if such a combination is not normally found in nature. In addition, a particular sequence may be “heterologous” with respect to a cell or organism into which it is inserted (i.e., does not naturally occur in that particular cell or organism).

The transcribable polynucleotide molecule may generally be any nucleic acid sequence for which an increased level of transcription is desired. Alternatively, the regulatory element and transcribable polynucleotide sequence may be designed to down-regulate a specific nucleic acid sequence. This is typically accomplished by linking the promoter to a transcribable polynucleotide sequence that is oriented in the antisense direction. One of ordinary skill in the art is familiar with such antisense technology. Briefly, as the antisense nucleic acid sequence is transcribed, it hybridizes to and sequesters a complementary nucleic acid sequence inside the cell. This duplex RNA molecule cannot be translated into a protein by the cell's translational machinery. Any nucleic acid sequence may be negatively regulated in this manner.
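The antisense orientation described above can be illustrated with a short Python sketch. It assumes idealized base pairing and uses a hypothetical nine-nucleotide coding fragment; the function names are illustrative only. The sketch shows that transcribing the reverse complement of a coding strand yields an RNA that is fully complementary to the sense mRNA and can therefore form a duplex with it.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_insert(coding_strand: str) -> str:
    """Reverse complement of the coding strand: the sequence that would be
    placed downstream of a promoter to obtain an antisense transcript."""
    return coding_strand.upper().translate(_DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """RNA with the same sequence as the given coding (non-template) strand."""
    return coding_strand.upper().replace("T", "U")


def forms_duplex(rna_a: str, rna_b: str) -> bool:
    """True if the two RNAs are perfect reverse complements of each other."""
    return rna_a == rna_b.translate(_RNA_COMPLEMENT)[::-1]


sense_dna = "ATGGCTTCA"  # hypothetical fragment, not from the sequence listing
sense_rna = transcribe(sense_dna)
antisense_rna = transcribe(antisense_insert(sense_dna))
print(sense_rna, antisense_rna, forms_duplex(sense_rna, antisense_rna))
```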

A regulatory element of the present invention may also be operably linked to a modified transcribable polynucleotide molecule that is heterologous with respect to the promoter. The transcribable polynucleotide molecule may be modified to provide various desirable features. For example, a transcribable polynucleotide molecule may be modified to increase the content of essential amino acids, enhance translation of the amino acid sequence, alter post-translational modifications (e.g., phosphorylation sites), transport a translated product to a compartment inside or outside of the cell, improve protein stability, insert or delete cell signaling motifs, etc.

Due to the degeneracy of the genetic code, different nucleotide codons may be used to code for a particular amino acid. A host cell often displays a preferred pattern of codon usage. Transcribable polynucleotide molecules are preferably constructed to utilize the codon usage pattern of the particular host cell. This generally enhances the expression of the transcribable polynucleotide sequence in a transformed host cell. Any of the above described nucleic acid and amino acid sequences may be modified to reflect the preferred codon usage of a host cell or organism in which they are contained. Modification of a transcribable polynucleotide sequence for optimal codon usage in plants is described in U.S. Pat. No. 5,689,052, herein incorporated by reference.
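The effect of codon degeneracy can be illustrated with a minimal Python sketch that swaps synonymous codons without changing the encoded amino acid sequence. The standard genetic code is used for translation, but the preference table below is purely hypothetical and does not represent measured codon usage of rice or any other host; the function names and example sequence are likewise illustrative.

```python
# Standard genetic code in the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# HYPOTHETICAL preferences for illustration only (one codon per amino acid).
HYPOTHETICAL_PREFERRED = {"A": "GCC", "S": "TCC", "K": "AAG", "L": "CTG"}


def codons(seq: str) -> list:
    """Split an in-frame coding sequence into codons."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon, if one is listed."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(seq))


original = "ATGGCTTCAAAATTATAA"  # hypothetical coding sequence
recoded = recode(original, HYPOTHETICAL_PREFERRED)
print(recoded, translate(original) == translate(recoded))
```

In practice the preference table would be derived from codon usage data for the intended host, and other constraints such as GC content or cryptic splice sites would typically also be considered.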

Additional variations in the transcribable polynucleotide molecules may encode proteins having equivalent or superior characteristics when compared to the proteins from which they are engineered. Mutations may include, but are not limited to, deletions, insertions, truncations, substitutions, fusions, shuffling of motif sequences, and the like. Mutations to a transcribable polynucleotide molecule may be introduced in either a specific or random manner, both of which are well known to those of skill in the art of molecular biology.

Thus, one embodiment of the invention is a regulatory element such as provided in SEQ ID NO: 1 through SEQ ID NO: 18, operably linked to a transcribable polynucleotide molecule so as to modulate transcription of said transcribable polynucleotide molecule at a desired level or in a desired tissue or developmental pattern upon introduction of said construct into a plant cell. In one embodiment, the transcribable polynucleotide molecule comprises a protein-coding region of a gene, and the regulatory element affects the transcription of a functional mRNA molecule that is translated and expressed as a protein product. In another embodiment, the transcribable polynucleotide molecule comprises an antisense region of a gene, and the regulatory element affects the transcription of an antisense RNA molecule or other similar inhibitory RNA in order to inhibit expression of a specific RNA molecule of interest in a target host cell.

Genes of Agronomic Interest

The transcribable polynucleotide molecule preferably encodes a polypeptide that is suitable for incorporation into the diet of a human or an animal. Specifically, such transcribable polynucleotide molecules comprise genes of agronomic interest. As used herein, the term “gene of agronomic interest” refers to a transcribable polynucleotide molecule that includes but is not limited to a gene that provides a desirable characteristic associated with plant morphology, physiology, growth and development, yield, nutritional enhancement, disease or pest resistance, or environmental or chemical tolerance. Suitable transcribable polynucleotide molecules include but are not limited to those encoding a yield protein, a stress resistance protein, a developmental control protein, a tissue differentiation protein, a meristem protein, an environmentally responsive protein, a senescence protein, a hormone responsive protein, an abscission protein, a source protein, a sink protein, a flower control protein, a seed protein, an herbicide resistance protein, a disease resistance protein, a fatty acid biosynthetic enzyme, a tocopherol biosynthetic enzyme, an amino acid biosynthetic enzyme, or an insecticidal protein.

In one embodiment of the invention, a polynucleotide molecule as shown in SEQ ID NO: 1 through SEQ ID NO: 18, or complements thereof, or fragments thereof, or cis elements thereof comprising regulatory elements is incorporated into a construct such that a polynucleotide molecule of the present invention is operably linked to a transcribable polynucleotide molecule that is a gene of agronomic interest.

The expression of a gene of agronomic interest is desirable in order to confer an agronomically important trait. A gene of agronomic interest that provides a beneficial agronomic trait to crop plants may include, but is not limited to, genetic elements comprising herbicide resistance (U.S. Pat. Nos. 6,803,501; 6,448,476; 6,248,876; 6,225,114; 6,107,549; 5,866,775; 5,804,425; 5,633,435; 5,463,175), increased yield (U.S. Pat. RE38,446; U.S. Pat. Nos. 6,716,474; 6,663,906; 6,476,295; 6,441,277; 6,423,828; 6,399,330; 6,372,211; 6,235,971; 6,222,098; 5,716,837), insect control (U.S. Pat. Nos. 6,809,078; 6,713,063; 6,686,452; 6,657,046; 6,645,497; 6,642,030; 6,639,054; 6,620,988; 6,593,293; 6,555,655; 6,538,109; 6,537,756; 6,521,442; 6,501,009; 6,468,523; 6,326,351; 6,313,378; 6,284,949; 6,281,016; 6,248,536; 6,242,241; 6,221,649; 6,177,615; 6,156,573; 6,153,814; 6,110,464; 6,093,695; 6,063,756; 6,063,597; 6,023,013; 5,959,091; 5,942,664; 5,942,658; 5,880,275; 5,763,245; 5,763,241), fungal disease resistance (U.S. Pat. Nos. 6,653,280; 6,573,361; 6,506,962; 6,316,407; 6,215,048; 5,516,671; 5,773,696; 6,121,436; 6,316,407; 6,506,962), virus resistance (U.S. Pat. Nos. 6,617,496; 6,608,241; 6,015,940; 6,013,864; 5,850,023; 5,304,730), nematode resistance (U.S. Pat. No. 6,228,992), bacterial disease resistance (U.S. Pat. No. 5,516,671), plant growth and development (U.S. Pat. Nos. 6,723,897; 6,518,488), starch production (U.S. Pat. Nos. 6,538,181; 6,538,179; 6,538,178; 5,750,876; 6,476,295), modified oils production (U.S. Pat. Nos. 6,444,876; 6,426,447; 6,380,462), high oil production (U.S. Pat. Nos. 6,495,739; 5,608,149; 6,483,008; 6,476,295), modified fatty acid content (U.S. Pat. Nos. 6,828,475; 6,822,141; 6,770,465; 6,706,950; 6,660,849; 6,596,538; 6,589,767; 6,537,750; 6,489,461; 6,459,018), high protein production (U.S. Pat. No. 6,380,466), fruit ripening (U.S. Pat. No. 5,512,466), enhanced animal and human nutrition (U.S. Pat. Nos. 6,723,837; 6,653,530; 6,541,259; 5,985,605; 6,171,640), biopolymers (U.S. Pat. RE37,543; U.S. Pat. Nos. 6,228,623; 5,958,745 and U.S. Patent Publication No. US20030028917), environmental stress resistance (U.S. Pat. No. 6,072,103), pharmaceutical peptides and secretable peptides (U.S. Pat. Nos. 6,812,379; 6,774,283; 6,140,075; 6,080,560), improved processing traits (U.S. Pat. No. 6,476,295), improved digestibility (U.S. Pat. No. 6,531,648), low raffinose (U.S. Pat. No. 6,166,292), industrial enzyme production (U.S. Pat. No. 5,543,576), improved flavor (U.S. Pat. No. 6,011,199), nitrogen fixation (U.S. Pat. No. 5,229,114), hybrid seed production (U.S. Pat. No. 5,689,041), fiber production (U.S. Pat. Nos. 6,576,818; 6,271,443; 5,981,834; 5,869,720) and biofuel production (U.S. Pat. No. 5,998,700). The genetic elements, methods, and transgenes described in the patents listed above are incorporated herein by reference.

Alternatively, a transcribable polynucleotide molecule can effect the above mentioned plant characteristic or phenotype by encoding an RNA molecule that causes the targeted inhibition of expression of an endogenous gene, for example via antisense, inhibitory RNA (RNAi), or cosuppression-mediated mechanisms. The RNA could also be a catalytic RNA molecule (i.e., a ribozyme) engineered to cleave a desired endogenous mRNA product. Thus, any transcribable polynucleotide molecule that encodes a transcribed RNA molecule that affects a phenotype or morphology change of interest may be useful for the practice of the present invention.

Selectable Markers

As used herein the term “marker” refers to any transcribable polynucleotide molecule whose expression, or lack thereof, can be screened for or scored in some way. Marker genes for use in the practice of the present invention include, but are not limited to transcribable polynucleotide molecules encoding β-glucuronidase (GUS described in U.S. Pat. No. 5,599,670, which is incorporated herein by reference), green fluorescent protein (GFP described in U.S. Pat. Nos. 5,491,084 and 6,146,826, all of which are incorporated herein by reference), proteins that confer antibiotic resistance, or proteins that confer herbicide tolerance. Marker genes in genetically modified plants are generally of two types: genes conferring antibiotic resistance or genes conferring herbicide tolerance.

Useful antibiotic resistance markers, including those encoding proteins conferring resistance to kanamycin (nptII), hygromycin B (aph IV), streptomycin or spectinomycin (aad, spec/strep) and gentamycin (aac3 and aacC4) are known in the art.

Herbicides for which transgenic plant tolerance has been demonstrated, and to which the method of the present invention can be applied, include but are not limited to: glyphosate, glufosinate, sulfonylureas, imidazolinones, bromoxynil, dalapon, dicamba, cyclohexanedione, protoporphyrinogen oxidase inhibitors, and isoxaflutole herbicides. Polynucleotide molecules encoding proteins involved in herbicide tolerance are known in the art, and include, but are not limited to a polynucleotide molecule encoding 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS described in U.S. Pat. Nos. 5,627,061, 5,633,435, 6,040,497 and in 5,094,945 for glyphosate tolerance, all of which are incorporated herein by reference); polynucleotides encoding a glyphosate oxidoreductase and a glyphosate-N-acetyl transferase (GOX described in U.S. Pat. No. 5,463,175 and GAT described in U.S. Patent publication 20030083480, and dicamba monooxygenase described in U.S. Patent publication 20030135879, all of which are incorporated herein by reference); a polynucleotide molecule encoding bromoxynil nitrilase (Bxn described in U.S. Pat. No. 4,810,648 for Bromoxynil tolerance, which is incorporated herein by reference); a polynucleotide molecule encoding phytoene desaturase (crtI) described in Misawa et al, (1993) Plant J. 4:833-840 and Misawa et al, (1994) Plant J. 6:481-489 for norflurazon tolerance; a polynucleotide molecule encoding acetohydroxyacid synthase (AHAS, aka ALS) described in Sathasivan et al (1990) Nucl. Acids Res. 18:2188-2193 for tolerance to sulfonylurea herbicides; and the bar gene described in DeBlock, et al (1987) EMBO J. 6:2513-2519 for glufosinate and bialaphos tolerance. The regulatory elements of the present invention can express transcribable polynucleotide molecules that encode for phosphinothricin acetyltransferase, glyphosate resistant EPSPS, aminoglycoside phosphotransferase, hydroxyphenyl pyruvate dehydrogenase, hygromycin phosphotransferase, neomycin phosphotransferase, dalapon dehalogenase, bromoxynil resistant nitrilase, anthranilate synthase, glyphosate oxidoreductase and glyphosate-N-acetyl transferase.

Included within the term “selectable markers” are also genes which encode a secretable marker whose secretion can be detected as a means of identifying or selecting for transformed cells. Examples include markers that encode a secretable antigen that can be identified by antibody interaction, or even secretable enzymes which can be detected catalytically. Selectable secreted marker proteins fall into a number of classes, including small, diffusible proteins which are detectable (e.g., by ELISA), small active enzymes which are detectable in extracellular solution (e.g., α-amylase, β-lactamase, phosphinothricin transferase), or proteins which are inserted or trapped in the cell wall (such as proteins which include a leader sequence such as that found in the expression unit of extensin or tobacco PR-S). Other possible selectable marker genes will be apparent to those of skill in the art.

The selectable marker is preferably GUS, green fluorescent protein (GFP), neomycin phosphotransferase II (nptII), luciferase (LUX), an antibiotic resistance coding sequence, or an herbicide (e.g., glyphosate) resistance coding sequence. The selectable marker is most preferably a kanamycin, hygromycin, or herbicide resistance marker.

Constructs and Vectors

The constructs of the present invention are generally double Ti plasmid border DNA constructs that have the right border (RB or AGRtu.RB) and left border (LB or AGRtu.LB) regions of the Ti plasmid isolated from Agrobacterium tumefaciens comprising a T-DNA, that along with transfer molecules provided by the Agrobacterium cells, permit the integration of the T-DNA into the genome of a plant cell (see for example U.S. Pat. No. 6,603,061, herein incorporated by reference in its entirety). The constructs may also contain the plasmid backbone DNA segments that provide replication function and antibiotic selection in bacterial cells, for example, an Escherichia coli origin of replication such as ori322, a broad host range origin of replication such as oriV or oriRi, and a coding region for a selectable marker such as Spec/Strp that encodes for Tn7 aminoglycoside adenyltransferase (aadA) conferring resistance to spectinomycin or streptomycin, or a gentamicin (Gm, Gent) selectable marker gene. For plant transformation, the host bacterial strain is often Agrobacterium tumefaciens AB1, C58, or LBA4404; however, other strains known to those skilled in the art of plant transformation can function in the present invention.

As used herein, the term “construct” means any recombinant polynucleotide molecule such as a plasmid, cosmid, virus, autonomously replicating polynucleotide molecule, phage, or linear or circular single-stranded or double-stranded DNA or RNA polynucleotide molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a polynucleotide molecule where one or more polynucleotide molecules have been linked in a functionally operative manner, i.e., operably linked. As used herein, the term “vector” means any recombinant polynucleotide construct that may be used for the purpose of transformation, i.e., the introduction of heterologous DNA into a host cell.

Methods are known in the art for assembling and introducing constructs into a cell in such a manner that the transcribable polynucleotide molecule is transcribed into a functional mRNA molecule that is translated and expressed as a protein product. For the practice of the present invention, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art, see for example, Molecular Cloning: A Laboratory Manual, 3rd edition Volumes 1, 2, and 3 (2000) J. F. Sambrook, D. W. Russell, and N. Irwin, Cold Spring Harbor Laboratory Press. Methods for making recombinant vectors particularly suited to plant transformation include, without limitation, those described in U.S. Pat. Nos. 4,971,908, 4,940,835, 4,769,061 and 4,757,011, all of which are herein incorporated by reference in their entirety. These types of vectors have also been reviewed (Rodriguez, et al. Vectors: A Survey of Molecular Cloning Vectors and Their Uses, Butterworths, Boston, 1988; Glick et al., Methods in Plant Molecular Biology and Biotechnology, CRC Press, Boca Raton, Fla., 1993). Typical vectors useful for expression of nucleic acids in higher plants are well known in the art and include vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens (Rogers, et al., Meth. In Enzymol, 153: 253-277, 1987). Other recombinant vectors useful for plant transformation, including the pCaMVCN transfer control vector, have also been described (Fromm et al., Proc. Natl. Acad. Sci. USA, 82(17): 5824-5828, 1985).

Regulatory Elements in the Construct

Various untranslated regulatory sequences may be included in the recombinant vector. Any such regulatory sequences may be provided in a recombinant vector with other regulatory sequences. Such combinations can be designed or modified to produce desirable regulatory features. Constructs of the present invention would typically comprise one or more gene expression regulatory elements operably linked to a transcribable polynucleotide molecule operably linked to a 3′ transcription termination polynucleotide molecule.
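The typical element order described above (one or more regulatory elements, then a transcribable polynucleotide molecule, then a 3′ transcription termination molecule) can be modeled as a simple ordered list. The following Python sketch is illustrative only; the element kinds, class name, and example element names are hypothetical labels and do not correspond to specific sequences of the invention.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for upstream gene expression regulatory elements.
REGULATORY_KINDS = {"promoter", "leader", "intron", "enhancer"}


@dataclass(frozen=True)
class Element:
    kind: str   # e.g. "promoter", "transcribable", "terminator"
    name: str   # free-text label for the element


def is_typical_cassette(elements: List[Element]) -> bool:
    """Check the 5' to 3' order: regulatory element(s), transcribable
    polynucleotide, then a 3' transcription termination element."""
    kinds = [e.kind for e in elements]
    if len(kinds) < 3:
        return False
    *upstream, body, end = kinds
    return (
        all(k in REGULATORY_KINDS for k in upstream)
        and body == "transcribable"
        and end == "terminator"
    )


cassette = [
    Element("promoter", "example promoter"),
    Element("leader", "example 5' leader"),
    Element("transcribable", "example marker coding region"),
    Element("terminator", "example 3' region"),
]
print(is_typical_cassette(cassette))
```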

Constructs of the present invention may also include additional 5′ untranslated regions (5′ UTR) of an mRNA polynucleotide molecule or gene which can play an important role in translation initiation. For example, non-translated 5′ leader polynucleotide molecules derived from heat shock protein genes have been demonstrated to enhance gene expression in plants (see for example, U.S. Pat. Nos. 5,659,122 and 5,362,865, all of which are incorporated herein by reference). These additional upstream regulatory polynucleotide molecules may be derived from a source that is native or heterologous with respect to the other elements present on the construct.

One or more additional promoters may also be provided in the recombinant vector. These promoters may be operably linked to any of the transcribable polynucleotide sequences described above. Alternatively, the promoters may be operably linked to other nucleic acid sequences, such as those encoding transit peptides, selectable marker proteins, or antisense sequences. These additional promoters may be selected on the basis of the cell type into which the vector will be inserted. Promoters which function in bacteria, yeast, and plants are all well taught in the art. The additional promoters may also be selected on the basis of their regulatory features. Examples of such features include enhancement of transcriptional activity, inducibility, tissue-specificity, and developmental stage-specificity. In plants, promoters that are inducible, of viral or synthetic origin, constitutively active, temporally regulated, and spatially regulated have been described (Poszkowski, et al., EMBO J., 3: 2719, 1989; Odell, et al., Nature, 313:810, 1985; Chau et al., Science, 244:174-181, 1989).

Often-used constitutive promoters include the CaMV 35S promoter (Odell, et al., Nature, 313: 810, 1985), the enhanced CaMV 35S promoter, the Figwort Mosaic Virus (FMV) promoter (Richins, et al., Nucleic Acids Res. 20: 8451, 1987), the mannopine synthase (mas) promoter, the nopaline synthase (nos) promoter, and the octopine synthase (ocs) promoter.

Useful inducible promoters include promoters induced by salicylic acid or polyacrylic acids (PR-1; Williams, et al., Biotechnology 10:540-543, 1992), induced by application of safeners (substituted benzenesulfonamide herbicides; Hershey and Stoner, Plant Mol. Biol. 17: 679-690, 1991), heat-shock promoters (Ou-Lee et al., Proc. Natl. Acad. Sci. U.S.A. 83: 6815, 1986; Ainley et al., Plant Mol. Biol. 14: 949, 1990), a nitrate-inducible promoter derived from the spinach nitrite reductase transcribable polynucleotide sequence (Back et al., Plant Mol. Biol. 17: 9, 1991), hormone-inducible promoters (Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15: 905, 1990), and light-inducible promoters associated with the small subunit of RuBP carboxylase and LHCP families (Kuhlemeier et al., Plant Cell 1: 471, 1989; Feinbaum et al., Mol. Gen. Genet. 226: 449-456, 1991; Weisshaar, et al., EMBO J. 10: 1777-1786, 1991; Lam and Chua, J. Biol. Chem. 266: 17131-17135, 1990; Castresana et al., EMBO J. 7: 1929-1936, 1988; Schulze-Lefert, et al., EMBO J. 8: 651, 1989).

Examples of useful tissue-specific, developmentally-regulated promoters include the β-conglycinin 7Sα promoter (Doyle et al., J. Biol. Chem. 261: 9228-9238, 1986; Slighton and Beachy, Planta 172: 356, 1987), and seed-specific promoters (Knutzon, et al., Proc. Natl. Acad. Sci. U.S.A. 89: 2624-2628, 1992; Bustos, et al., EMBO J. 10: 1469-1479, 1991; Lam and Chua, Science 248: 471, 1991). Plant functional promoters useful for preferential expression in seed plastid include those from plant storage proteins and from proteins involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5′ regulatory regions from such transcribable polynucleotide sequences as napin (Kridl et al., Seed Sci. Res. 1: 209, 1991), phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, and oleosin. Seed-specific regulation is discussed in EP 0 255 378.

Another exemplary tissue-specific promoter is the lectin promoter, which is specific for seed tissue. The Lectin protein in soybean seeds is encoded by a single transcribable polynucleotide sequence (Le1) that is only expressed during seed maturation and accounts for about 2 to about 5% of total seed mRNA. The lectin transcribable polynucleotide sequence and seed-specific promoter have been fully characterized and used to direct seed specific expression in transgenic tobacco plants (Vodkin, et al., Cell, 34: 1023, 1983; Lindstrom, et al., Developmental Genetics, 11: 160, 1990).

Particularly preferred additional promoters in the recombinant vector include the nopaline synthase (nos), mannopine synthase (mas), and octopine synthase (ocs) promoters, which are carried on tumor-inducing plasmids of Agrobacterium tumefaciens; the cauliflower mosaic virus (CaMV) 19S and 35S promoters; the enhanced CaMV 35S promoter; the Figwort Mosaic Virus (FMV) 35S promoter; the light-inducible promoter from the small subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO); the EIF-4A promoter from tobacco (Mandel, et al., Plant Mol. Biol, 29: 995-1004, 1995); corn sucrose synthetase 1 (Yang, et al., Proc. Natl. Acad. Sci. USA, 87: 4144-48, 1990); corn alcohol dehydrogenase 1 (Vogel, et al., J. Cell Biochem., (Suppl) 13D: 312, 1989); corn light harvesting complex (Simpson, Science, 233: 34, 1986); corn heat shock protein (Odell, et al., Nature, 313: 810, 1985); the chitinase promoter from Arabidopsis (Samac, et al., Plant Cell, 3:1063-1072, 1991); the LTP (Lipid Transfer Protein) promoters from broccoli (Pyee, et al., Plant J., 7: 49-59, 1995); petunia chalcone isomerase (Van Tunen, et al., EMBO J. 7: 1257, 1988); bean glycine rich protein 1 (Keller, et al., EMBO J., 8: 1309-1314, 1989); Potato patatin (Wenzler, et al., Plant Mol. Biol., 12: 41-50, 1989); the ubiquitin promoter from maize (Christensen et al., Plant Mol. Biol., 18: 675-689, 1992); and the actin promoter from corn (McElroy, et al., Plant Cell, 2:163-171, 1990).

The additional promoter is preferably seed selective, tissue specific, constitutive, or inducible. The promoter is most preferably the nopaline synthase (NOS), octopine synthase (OCS), mannopine synthase (MAS), cauliflower mosaic virus 19S and 35S (CaMV19S, CaMV35S), enhanced CaMV (eCaMV), ribulose 1,5-bisphosphate carboxylase (ss-RUBISCO), figwort mosaic virus (FMV), CaMV derived AS4, tobacco RB7, wheat PDX1, tobacco EIF-4, lectin protein (Le1), or corn RC2 promoter.

Translational enhancers may also be incorporated as part of the recombinant vector. Thus the recombinant vector may preferably contain one or more 5′ non-translated leader sequences which serve to enhance expression of the nucleic acid sequence. Such enhancer sequences may be desirable to increase or alter the translational efficiency of the resultant mRNA. Preferred 5′ nucleic acid sequences include dSSU 5′, PetHSP70 5′, and GmHSP17.9 5′.

The recombinant vector may further comprise a nucleic acid sequence encoding a transit peptide. This peptide may be useful for directing a protein to the extracellular space, a chloroplast, or to some other compartment inside or outside of the cell (see, e.g., European Patent Application Publication Number 0218571, herein incorporated by reference).

The transcribable polynucleotide sequence in the recombinant vector may comprise introns. The introns may be heterologous with respect to the transcribable polynucleotide sequence. Preferred introns include the corn actin intron and the corn HSP70 intron.

In addition, constructs may include additional regulatory polynucleotide molecules from the 3′-untranslated region (3′ UTR) of plant genes (e.g., a 3′ UTR to increase stability of the mRNA, such as the PI-II termination region of potato or the octopine or nopaline synthase 3′ termination regions). A 3′ non-translated region typically provides a transcriptional termination signal, and a polyadenylation signal which functions in plants to cause the addition of adenylate nucleotides to the 3′ end of the mRNA. These may be obtained from the 3′ regions of the nopaline synthase (nos) coding sequence, the soybean 7Sα storage protein coding sequence, the albumin coding sequence, and the pea ssRUBISCO E9 coding sequence. Particularly preferred 3′ nucleic acid sequences include nos 3′, E9 3′, ADR12 3′, 7Sα 3′, 11S 3′, and albumin 3′. Typically, nucleic acid sequences located a few hundred base pairs downstream of the polyadenylation site serve to terminate transcription. These regions are required for efficient polyadenylation of transcribed mRNA. These additional downstream regulatory polynucleotide molecules may be derived from a source that is native or heterologous with respect to the other elements present on the construct.

Transcribable Polynucleotides in the Construct

The promoter in the recombinant vector is preferably operably linked to a transcribable polynucleotide sequence. Exemplary transcribable polynucleotide sequences, and modified forms thereof, are described in detail above. The promoter of the present invention may be operably linked to a transcribable polynucleotide sequence that is heterologous with respect to the promoter. In one aspect, the transcribable polynucleotide sequence may generally be any nucleic acid sequence for which an increased level of transcription is desired. The transcribable polynucleotide sequence preferably encodes a polypeptide that is suitable for incorporation into the diet of a human or an animal. Suitable transcribable polynucleotide sequences include those encoding a yield protein, a stress tolerance protein, a developmental control protein, a tissue differentiation protein, a meristem protein, an environmentally responsive protein, a senescence protein, a hormone responsive protein, an abscission protein, a source protein, a sink protein, a flower control protein, a seed protein, an herbicide resistance protein, a disease resistance protein, a fatty acid biosynthetic enzyme, a tocopherol biosynthetic enzyme, an amino acid biosynthetic enzyme, or an insecticidal protein.

Alternatively, the promoter and transcribable polynucleotide sequencemay be designed to down-regulate a specific nucleic acid sequence. Thisis typically accomplished by linking the promoter to a transcribablepolynucleotide sequence that is oriented in the antisense direction. Oneof ordinary skill in the art is familiar with such antisense technology.Using such an approach, a cellular nucleic acid sequence is effectivelydown regulated as the subsequent steps of translation are disrupted.Nucleic acid sequences may be negatively regulated in this manner.
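As a minimal sketch of what "oriented in the antisense direction" means at the sequence level, the following Python snippet computes the reverse complement of a sense-strand sequence. The example sequence is made up for illustration and is not taken from the sequence listing; the function name is likewise hypothetical.

```python
# Illustrative sketch only: the sequence below is invented, not from SEQ ID NO: 1-18.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G, reversed)."""
    return seq.translate(COMPLEMENT)[::-1]


sense = "ATGGCTTCAAGC"                  # hypothetical sense-strand fragment
antisense = reverse_complement(sense)   # orientation placed behind the promoter
print(antisense)                        # GCTTGAAGCCAT
```

A transcript produced from the antisense-oriented insert is complementary to the endogenous mRNA, which is the basis of the down-regulation described above.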

Methods are known in the art for constructing and introducing constructsinto a cell in such a manner that the transcribable polynucleotidemolecule is transcribed into a molecule that is capable of causing genesuppression. For example, posttranscriptional gene suppression using aconstruct with an anti-sense oriented transcribable polynucleotidemolecule to regulate gene expression in plant cells is disclosed in U.S.Pat. Nos. 5,107,065 and 5,759,829; post-transcriptional gene suppressionusing a construct with a sense-oriented transcribable polynucleotidemolecule to regulate gene expression in plants is disclosed in U.S. Pat.Nos. 5,283,184 and 5,231,020, all of which are hereby incorporated byreference.

Thus, one embodiment of the invention is a construct comprising aregulatory element such as provided in SEQ ID NO: 1 through SEQ ID NO:18, operably linked to a transcribable polynucleotide molecule so as tomodulate transcription of said transcribable polynucleotide molecule ata desired level or in a desired tissue or developmental pattern uponintroduction of said construct into a plant cell. In one embodiment, thetranscribable polynucleotide molecule comprises a protein-coding regionof a gene, and the regulatory element affects the transcription of afunctional mRNA molecule that is translated and expressed as a proteinproduct. In another embodiment, the transcribable polynucleotidemolecule comprises an antisense region of a gene, and the regulatoryelement affects the transcription of an antisense RNA molecule or othersimilar inhibitory RNA in order to inhibit expression of a specific RNAmolecule of interest in a target host cell.

Exemplary transcribable polynucleotide molecules for incorporation intoconstructs of the present invention include, for example, polynucleotidemolecules or genes from a species other than the target species or genesthat originate with or are present in the same species, but areincorporated into recipient cells by genetic engineering methods ratherthan classical reproduction or breeding techniques. The type ofpolynucleotide molecule can include but is not limited to apolynucleotide molecule that is already present in the plant cell, apolynucleotide molecule from another plant, a polynucleotide moleculefrom a different organism, or a polynucleotide molecule generatedexternally, such as a polynucleotide molecule containing an antisensemessage of a gene, or a polynucleotide molecule encoding an artificial,synthetic, or otherwise modified version of a transgene.

The constructs of this invention comprising a regulatory element identified or isolated from Oryza sativa may further comprise one or more transcribable polynucleotide molecules. In one embodiment of the invention, a polynucleotide molecule as shown in SEQ ID NO: 1 through SEQ ID NO: 18, or any complements thereof, or any fragments thereof, comprising regulatory elements such as promoters, leaders and chimeric regulatory elements, is incorporated into a construct such that a polynucleotide molecule of the present invention is operably linked to a transcribable polynucleotide molecule that is a selectable marker or a gene of agronomic interest.

The gene regulatory elements of the present invention can beincorporated into a construct using selectable markers and tested intransient or stable plant analyses to provide an indication of theregulatory element's gene expression pattern in stable transgenicplants. Current methods of generating transgenic plants employ aselectable marker gene which is transferred along with any other genesof interest usually on the same DNA molecule. The presence of a suitablemarker is necessary to facilitate the detection of genetically modifiedplant tissue during development.

Thus, in one embodiment of the invention, a polynucleotide molecule ofthe present invention as shown in SEQ ID NO: 1 through SEQ ID NO: 18, orfragments thereof, or complements thereof, or cis elements thereof isincorporated into a polynucleotide construct such that a polynucleotidemolecule of the present invention is operably linked to a transcribablepolynucleotide molecule that provides for a selectable, screenable, orscorable marker. The constructs containing the regulatory elementsoperably linked to a marker gene may be delivered to the tissues and thetissues analyzed by the appropriate mechanism, depending on the marker.The quantitative or qualitative analyses are used as a tool to evaluatethe potential expression profile of a regulatory element whenoperatively linked to a gene of agronomic interest in stable plants. Anymarker gene, described above, may be used in a transient assay.

Methods of testing for marker gene expression in transient assays areknown to those of skill in the art. Transient expression of marker geneshas been reported using a variety of plants, tissues, and DNA deliverysystems. For example, types of transient analyses can include but arenot limited to direct gene delivery via electroporation or particlebombardment of tissues in any transient plant assay using any plantspecies of interest. Such transient systems would include but are notlimited to electroporation of protoplasts from a variety of tissuesources or particle bombardment of specific tissues of interest. Thepresent invention encompasses the use of any transient expression systemto evaluate regulatory elements operably linked to any transcribablepolynucleotide molecule, including but not limited to marker genes orgenes of agronomic interest. Examples of plant tissues envisioned totest in transients via an appropriate delivery system would include butare not limited to leaf base tissues, callus, cotyledons, roots,endosperm, embryos, floral tissue, pollen, and epidermal tissue.

Transformation

The invention is also directed to a method of producing transformedcells and plants which comprise, in a 5′ to 3′ orientation, a geneexpression regulatory element operably linked to a heterologoustranscribable polynucleotide sequence. Other sequences may also beintroduced into the cell, including 3′ transcriptional terminators, 3′polyadenylation signals, other translated or untranslated sequences,transit or targeting sequences, selectable markers, enhancers, andoperators.

The term “transformation” refers to the introduction of nucleic acidinto a recipient host. The term “host” refers to bacteria cells, fungi,animals and animal cells, plants and plant cells, or any plant parts ortissues including protoplasts, calli, roots, tubers, seeds, stems,leaves, seedlings, embryos, and pollen. As used herein, the term“transformed” refers to a cell, tissue, organ, or organism into whichhas been introduced a foreign polynucleotide molecule, such as aconstruct. The introduced polynucleotide molecule may be integrated intothe genomic DNA of the recipient cell, tissue, organ, or organism suchthat the introduced polynucleotide molecule is inherited by subsequentprogeny. A “transgenic” or “transformed” cell or organism also includesprogeny of the cell or organism and progeny produced from a breedingprogram employing such a transgenic plant as a parent in a cross andexhibiting an altered phenotype resulting from the presence of a foreignpolynucleotide molecule. The term “transgenic” refers to an animal,plant, or other organism containing one or more heterologous nucleicacid sequences.

There are many methods for introducing nucleic acids into plant cells. The method generally comprises the steps of selecting a suitable host cell, transforming the host cell with a recombinant vector, and obtaining the transformed host cell. Suitable methods include bacterial infection (e.g. Agrobacterium), binary bacterial artificial chromosome vectors, and direct delivery of DNA (e.g. via PEG-mediated transformation, desiccation/inhibition-mediated DNA uptake, electroporation, agitation with silicon carbide fibers, and acceleration of DNA coated particles) (reviewed in Potrykus, et al., Ann. Rev. Plant Physiol. Plant Mol. Biol., 42: 205, 1991).

Technology for introduction of DNA into cells is well known to those ofskill in the art. Methods and materials for transforming plant cells byintroducing a plant polynucleotide construct into a plant genome in thepractice of this invention can include any of the well-known anddemonstrated methods including:

-   (1) chemical methods (Graham and Van der Eb, Virology, 54(2): 536-539, 1973; Zatloukal, et al., Ann. N.Y. Acad. Sci., 660: 136-153, 1992);
-   (2) physical methods such as microinjection (Capecchi, Cell, 22(2): 479-488, 1980), electroporation (Wong and Neumann, Biochem. Biophys. Res. Commun., 107(2): 584-587, 1982; Fromm et al., Proc. Natl. Acad. Sci. USA, 82(17): 5824-5828, 1985; U.S. Pat. No. 5,384,253, herein incorporated by reference), particle acceleration (Johnston and Tang, Methods Cell Biol., 43(A): 353-365, 1994; Fynan et al., Proc. Natl. Acad. Sci. USA, 90(24): 11478-11482, 1993) and microprojectile bombardment (as illustrated in U.S. Pat. Nos. 5,015,580; 5,550,318; 5,538,880; 6,160,208; 6,399,861; and 6,403,865, all of which are herein incorporated by reference);
-   (3) viral vectors (Clapp, Clin. Perinatol., 20(1): 155-168, 1993; Lu, et al., J. Exp. Med., 178(6): 2089-2096, 1993; Eglitis and Anderson, Biotechniques, 6(7): 608-614, 1988);
-   (4) receptor-mediated mechanisms (Curiel et al., Hum. Gen. Ther., 3(2): 147-154, 1992; Wagner, et al., Proc. Natl. Acad. Sci. USA, 89(13): 6099-6103, 1992);
-   (5) bacterial mediated mechanisms such as Agrobacterium-mediated transformation (as illustrated in U.S. Pat. Nos. 5,824,877; 5,591,616; 5,981,840; and 6,384,301, all of which are herein incorporated by reference);
-   (6) direct introduction of nucleic acids into pollen by directly injecting a plant's reproductive organs (Zhou, et al., Methods in Enzymology, 101: 433, 1983; Hess, Intern Rev. Cytol., 107: 367, 1987; Luo, et al., Plant Mol. Biol. Reporter, 6: 165, 1988; Pena, et al., Nature, 325: 274, 1987);
-   (7) protoplast transformation, as illustrated in U.S. Pat. No. 5,508,184 (herein incorporated by reference); and
-   (8) injection of the nucleic acids into immature embryos (Neuhaus, et al., Theor. Appl. Genet., 75: 30, 1987).

Any of the above described methods may be utilized to transform a hostcell with one or more gene regulatory elements of the present inventionand one or more transcribable polynucleotide molecules. Host cells maybe any cell or organism such as a plant cell, algae cell, algae, fungalcell, fungi, bacterial cell, or insect cell. Preferred hosts andtransformants include cells from: plants, Aspergillus, yeasts, insects,bacteria and algae.

The prokaryotic transformed cell or organism is preferably a bacterial cell, even more preferably an Agrobacterium, Bacillus, Escherichia, or Pseudomonas cell, and most preferably an Escherichia coli cell. Alternatively, the transformed organism is preferably a yeast or fungal cell. The yeast cell is preferably a Saccharomyces cerevisiae, Schizosaccharomyces pombe, or Pichia pastoris cell. Methods to transform such cells or organisms are known in the art (EP 0238023; Yelton et al., Proc. Natl. Acad. Sci. (U.S.A.), 81:1470-1474 (1984); Malardier et al., Gene, 78:147-156 (1989); Becker and Guarente, In: Abelson and Simon (eds.), Guide to Yeast Genetics and Molecular Biology, Methods Enzymol., Vol. 194, pp. 182-187, Academic Press, Inc., New York; Ito et al., J. Bacteriology, 153:163 (1983); Hinnen et al., Proc. Natl. Acad. Sci. (U.S.A.), 75:1920 (1978); Bennett and LaSure (eds.), More Gene Manipulations in Fungi, Academic Press, CA (1991)). Methods to produce proteins of the present invention from such organisms are also known (Kudla et al., EMBO, 9:1355-1364 (1990); Jarai and Buxton, Current Genetics, 26:2238-2244 (1994); Verdier, Yeast, 6:271-297 (1990); MacKenzie et al., Journal of Gen. Microbiol., 139: 2295-2307 (1993); Hartl et al., TIBS, 19:20-25 (1994); Bergeron et al., TIBS, 19:124-128 (1994); Demolder et al., J. Biotechnology, 32:179-189 (1994); Craig, Science, 260: 1902-1903 (1993); Gething and Sambrook, Nature, 355:33- (1992); Puig and Gilbert, J. Biol. Chem., 269:7764-7771 (1994); Wang and Tsou, FASEB Journal, 7:1515-1517 (1993); Robinson et al., Bio/Technology, 1:381-384 (1994); Enderlin and Ogrydziak, Yeast, 10:67-79 (1994); Fuller et al., Proc. Natl. Acad. Sci. (USA), 86:1434-1438 (1989); Julius et al., Cell, 37:1075-1089 (1984); Julius et al., Cell, 32:839-852 (1983)).

Another preferred embodiment of the present invention is thetransformation of a plant cell. A plant transformation constructcomprising a regulatory element of the present invention may beintroduced into plants by any plant transformation method.

Methods for transforming dicotyledons, primarily by use of Agrobacterium tumefaciens, and obtaining transgenic plants have been published for cotton (U.S. Pat. Nos. 5,004,863; 5,159,135; 5,518,908, all of which are herein incorporated by reference); soybean (U.S. Pat. Nos. 5,569,834; 5,416,011, all of which are herein incorporated by reference; McCabe, et al., Bio/Technology, 6: 923, 1988; Christou et al., Plant Physiol. 87:671-674 (1988)); Brassica (U.S. Pat. No. 5,463,174, herein incorporated by reference); peanut (Cheng et al., Plant Cell Rep. 15:653-657 (1996); McKently et al., Plant Cell Rep. 14:699-703 (1995)); papaya; and pea (Grant et al., Plant Cell Rep. 15:254-258 (1995)).

Transformation of monocotyledons using electroporation, particle bombardment and Agrobacterium has also been reported. Transformation and plant regeneration have been achieved in asparagus (Bytebier et al., Proc. Natl. Acad. Sci. (USA) 84:5354 (1987)); barley (Wan and Lemaux, Plant Physiol 104:37 (1994)); maize (Rhodes et al., Science 240: 204 (1988); Gordon-Kamm et al., Plant Cell 2:603-618 (1990); Fromm et al., Bio/Technology 8:833 (1990); Koziel et al., Bio/Technology 11:194 (1993); Armstrong et al., Crop Science 35:550-557 (1995)); oat (Somers et al., Bio/Technology 10:1589 (1992)); orchard grass (Horn et al., Plant Cell Rep. 7:469 (1988)); rice (Toriyama et al., Theor Appl. Genet. 205:34 (1986); Park et al., Plant Mol. Biol. 32:1135-1148 (1996); Abedinia et al., Aust. J. Plant Physiol. 24:133-141 (1997); Zhang and Wu, Theor. Appl. Genet. 76:835 (1988); Zhang et al., Plant Cell Rep. 7:379 (1988); Battraw and Hall, Plant Sci. 86:191-202 (1992); Christou et al., Bio/Technology 9:957 (1991)); rye (De la Pena et al., Nature 325:274 (1987)); sugarcane (Bower and Birch, Plant J. 2:409 (1992)); tall fescue (Wang et al., Bio/Technology 10:691 (1992)) and wheat (Vasil et al., Bio/Technology 10:667 (1992); U.S. Pat. No. 5,631,152, herein incorporated by reference).

The regeneration, development, and cultivation of plants from transformed plant protoplasts or explants is well taught in the art (Weissbach and Weissbach (Eds.), Methods for Plant Molecular Biology, Academic Press, Inc., San Diego, Calif., 1988; Horsch et al., Science, 227: 1229-1231, 1985). In this method, transformants are generally cultured in the presence of a selective medium which selects for the successfully transformed cells and induces the regeneration of plant shoots (Fraley et al., Proc. Natl. Acad. Sci. U.S.A., 80: 4803, 1983). These shoots are typically obtained within two to four months.

The shoots are then transferred to an appropriate root-inducing mediumcontaining the selective agent and an antibiotic to prevent bacterialgrowth. Many of the shoots will develop roots. These are thentransplanted to soil or other media to allow the continued developmentof roots. The method, as outlined, will generally vary depending on theparticular plant strain employed.

The regenerated transgenic plants are self-pollinated to providehomozygous transgenic plants. Alternatively, pollen obtained from theregenerated transgenic plants may be crossed with non-transgenic plants,preferably inbred lines of agronomically important species. Conversely,pollen from non-transgenic plants may be used to pollinate theregenerated transgenic plants.

The transformed plants are analyzed for the presence of the genes ofinterest and the expression level and/or profile conferred by theregulatory elements of the present invention. Those of skill in the artare aware of the numerous methods available for the analysis oftransformed plants. For example, methods for plant analysis include, butare not limited to Southern blots or northern blots, PCR-basedapproaches, biochemical analyses, phenotypic screening methods, fieldevaluations, and immunodiagnostic assays.

The seeds of the plants of this invention can be harvested from fertiletransgenic plants and be used to grow progeny generations of transformedplants of this invention including hybrid plant lines comprising theconstruct of this invention and expressing a gene of agronomic interest.The present invention also provides for parts of the plants of thepresent invention. Plant parts, without limitation, include seed,endosperm, ovule and pollen. In a particularly preferred embodiment ofthe present invention, the plant part is a seed. The invention alsoincludes and provides transformed plant cells which comprise a nucleicacid molecule of the present invention.

The transgenic plant may pass along the transformed nucleic acid sequence to its progeny. The transgenic plant is preferably homozygous for the transformed nucleic acid sequence and transmits that sequence to all of its offspring as a result of sexual reproduction. Progeny may be grown from seeds produced by the transgenic plant. These additional plants may then be self-pollinated to generate a true breeding line of plants. The progeny from these plants are evaluated, among other things, for gene expression. The gene expression may be detected by several common methods such as western blotting, northern blotting, immunoprecipitation, and ELISA.

Having now generally described the invention, the same will be morereadily understood through reference to the following examples which areprovided by way of illustration, and are not intended to be limiting ofthe present invention, unless specified.

Each periodical, patent, and other document or reference cited herein isherein incorporated by reference in its entirety.

The following examples are included to demonstrate preferred embodimentsof the invention. It should be appreciated by those of skill in the artthat the techniques disclosed in the examples that follow representtechniques discovered by the inventors to function well in the practiceof the invention. However, those of skill in the art should, in light ofthe present disclosure, appreciate that many changes can be made in thespecific embodiments that are disclosed and still obtain a like orsimilar result without departing from the spirit and scope of theinvention, therefore all matter set forth or shown in the accompanyingdrawings is to be interpreted as illustrative and not in a limitingsense.

EXAMPLES

Example 1 Generating a Genomic Bacterial Artificial Chromosome (BAC) Library

BACs are stable, non-chimeric cloning systems having genomic fragmentinserts (100-300 kb) and their DNA can be prepared for most types ofexperiments including DNA sequencing. BAC vector, pBeloBAC11, is derivedfrom the endogenous E. coli F-factor plasmid, which contains genes forstrict copy number control and unidirectional origin of DNA replication.Additionally, pBeloBAC11 has three unique restriction enzyme sites (HindIII, Bam HI and Sph I) located within the LacZ gene which can be used ascloning sites for megabase-size plant DNA. Indigo, another BAC vectorcontains Hind III and Eco R I cloning sites. This vector also contains arandom mutation in the LacZ gene that allows for darker blue colonies.

As an alternative, the P1-derived artificial chromosome (PAC) can beused as a large DNA fragment cloning vector (Ioannou, et al., NatureGenet. 6:84-89 (1994); Suzuki, et al., Gene 199:133-137 (1997)). The PACvector has most of the features of the BAC system, but also containssome of the elements of the bacteriophage P1 cloning system. BAClibraries are generated by ligating size-selected restriction digestedDNA with pBeloBAC11 followed by electroporation into E. coli. BAClibrary construction and characterization is extremely efficient whencompared to YAC (yeast artificial chromosome) library construction andanalysis, particularly because of the chimerism associated with YACs anddifficulties associated with extracting YAC DNA.

There are general methods for preparing megabase-size DNA from plants. For example, the protoplast method yields megabase-size DNA of high quality with minimal breakage. The process involves preparing young leaves, which are manually feathered with a razor blade before being incubated for four to five hours with cell-wall-degrading enzymes. A second method, developed by Zhang et al., Plant J. 7:175-184 (1995), is a universal nuclei method that works well for several divergent plant taxa. Fresh or frozen tissue is homogenized with a blender or mortar and pestle. Nuclei are then isolated and embedded. DNA prepared by the nuclei method is often more concentrated and is reported to contain lower amounts of chloroplast DNA than DNA prepared by the protoplast method.

Once protoplasts or nuclei are produced, they are embedded in an agarosematrix as plugs or microbeads. The agarose provides a support matrix toprevent shearing of the DNA while allowing enzymes and buffers todiffuse into the DNA. The DNA is purified and manipulated in the agaroseand is stable for more than one year at 4° C.

Once high molecular weight DNA is prepared, it is fragmented to the desired size range. In general, DNA fragmentation utilizes two approaches: 1) physical shearing and 2) partial digestion with a restriction enzyme that cuts relatively frequently within the genome. Since physical shearing is not dependent upon the frequency and distribution of particular restriction enzyme sites, this method should yield the most random distribution of DNA fragments. However, the ends of the sheared DNA fragments must be repaired and cloned directly, or restriction enzyme sites must be added by the addition of synthetic linkers. Because of the subsequent steps required to clone DNA fragmented by shearing, most protocols fragment DNA by partial restriction enzyme digestion. The advantage of partial restriction enzyme digestion is that no further enzymatic modification of the ends of the restriction fragments is necessary. Four common techniques that can be used to achieve reproducible partial digestion of megabase-size DNA are 1) varying the concentration of the restriction enzyme, 2) varying the time of incubation with the restriction enzyme, 3) varying the concentration of an enzyme cofactor (e.g., Mg2+) and 4) varying the ratio of endonuclease to methylase.

There are three cloning sites in pBeloBAC11, but only Hind III and Bam HI produce 5′ overhangs for easy vector dephosphorylation. These two restriction enzymes are primarily used to construct BAC libraries. The optimal partial digestion conditions for megabase-size DNA are determined by wide and narrow window digestions. To determine the optimum amount of Hind III, 1, 2, 3, 5, and 10 units of enzyme are each added to 50 ml aliquots of microbeads and incubated at 37° C. for 20 minutes.

After partial digestion of megabase-size DNA, the DNA is run on apulsed-field gel, and DNA in a size range of 100-500 kb is excised fromthe gel. This DNA is ligated to the BAC vector or subjected to a secondsize selection on a pulsed field gel under different running conditions.Studies have previously reported that two rounds of size selection caneliminate small DNA fragments co-migrating with the selected range inthe first pulse-field fractionation. Such a strategy results in anincrease in insert sizes and a more uniform insert size distribution.

A practical approach to performing size selections is to first test for the number of clones/microliter of ligation and insert size from the first size selected material. If the numbers are good (500 to 2000 white colonies/microliter of ligation) and the size range is also good (50 to 300 kb), then a second size selection is practical. When performing a second size selection, one expects an 80 to 95% decrease in the number of recombinant clones per transformation.

Twenty to two hundred nanograms of the size-selected DNA is ligated todephosphorylated BAC vector (molar ratio of 10 to 1 in BAC vectorexcess). Most BAC libraries use a molar ratio of 5 to 15:1 (sizeselected DNA:BAC vector). Transformation is carried out byelectroporation and the transformation efficiency for BACs is about 40to 1,500 transformants from one microliter of ligation product or 20 to1000 transformants/ng DNA.

Several tests can be carried out to determine the quality of a BAClibrary. Three basic tests to evaluate the quality include: the genomecoverage of a BAC library-average insert size, average number of cloneshybridizing with single copy probes and chloroplast DNA content. Thedetermination of the average insert size of the library is assessed intwo ways. First, during library construction every ligation is tested todetermine the average insert size by assaying 20-50 BAC clones perligation. DNA is isolated from recombinant clones using a standard minipreparation protocol, digested with Not I to free the insert from theBAC vector and then sized using pulsed field gel electrophoresis (Maule,Molecular Biotechnology 9:107-126 (1998)).

To determine the genome coverage of the library, it is screened withsingle copy RFLP markers distributed randomly across the genome byhybridization. Microtiter plates containing BAC clones are spotted ontoHybond membranes. Bacteria from 48 or 72 plates are spotted twice ontoone membrane resulting in 18,000 to 27,648 unique clones on eachmembrane in either a 4×4 or 5×5 orientation. Since each clone is presenttwice, false positives are easily eliminated and true positives areeasily recognized and identified.
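The clone counts per membrane follow from simple arithmetic on the number of plates. The sketch below assumes 384-well microtiter plates; that plate format is not stated in the text and is inferred only from the fact that 27,648 / 72 = 384. Function names are hypothetical.

```python
# Assumption: 384-well microtiter plates (not stated in the text).
WELLS_PER_PLATE = 384


def unique_clones_per_membrane(plates: int, wells_per_plate: int = WELLS_PER_PLATE) -> int:
    """Number of distinct BAC clones represented on one membrane."""
    return plates * wells_per_plate


def spots_per_membrane(plates: int, replicates: int = 2,
                       wells_per_plate: int = WELLS_PER_PLATE) -> int:
    """Total spots when every clone is spotted `replicates` times."""
    return unique_clones_per_membrane(plates, wells_per_plate) * replicates


for n_plates in (48, 72):
    print(n_plates, unique_clones_per_membrane(n_plates), spots_per_membrane(n_plates))
```

Under this assumption, 72 plates give exactly the 27,648 unique clones quoted above, and 48 plates give 18,432, consistent with the rounded lower figure.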

Finally, the chloroplast DNA content in the BAC library is estimated byhybridizing three chloroplast genes spaced evenly across the chloroplastgenome to the library on high density hybridization filters.

There are strategies for isolating rare sequences within the genome. For example, higher plant genomes can range in size from 100 Mb/1C (Arabidopsis) to 15,966 Mb/1C (Triticum aestivum) (Arumuganathan and Earle, Plant Mol Bio Rep. 9:208-219 (1991)). The number of clones required to achieve a given probability that any DNA sequence will be represented in a genomic library is N=(ln(1−P))/(ln(1−L/G)), where N is the number of clones required, P is the probability desired to get the target sequence, L is the length of the average clone insert in base pairs and G is the haploid genome length in base pairs (Clarke et al., Cell 9:91-100 (1976)). The rice BAC library of the present invention is constructed in the pBeloBAC11 or similar vector. Inserts are generated by partial Eco RI or other enzymatic digestion of DNA. The 25× library provides 4-5× coverage sequence from BAC clones across the genome.
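The library-size formula above can be evaluated directly. The following is a minimal sketch in Python; the genome size, insert size, and probability used in the example are illustrative assumptions, not values taken from this disclosure, and the function names are hypothetical.

```python
import math


def clones_required(p: float, insert_bp: float, genome_bp: float) -> int:
    """N = ln(1 - P) / ln(1 - L/G), rounded up to a whole number of clones."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < insert_bp < genome_bp:
        raise ValueError("insert_bp must be positive and smaller than genome_bp")
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - insert_bp / genome_bp))


def fold_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents represented by the library: N * L / G."""
    return n_clones * insert_bp / genome_bp


# Illustrative values only (assumptions): 150 kb average insert,
# 430 Mb haploid genome, 99% probability of recovering a given sequence.
n = clones_required(p=0.99, insert_bp=150_000, genome_bp=430_000_000)
print(n, round(fold_coverage(n, 150_000, 430_000_000), 2))
```

Because N grows with ln(1−P), raising the desired probability from 99% to 99.9% increases the required number of clones by about half, not tenfold.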

Example 2 Sequencing Genomic DNA Inserts from a Genomic BAC Library

Two basic methods can be used for DNA sequencing, the chain terminationmethod of Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977),and the chemical degradation method of Maxam and Gilbert, Proc. Natl.Acad. Sci. USA 74:560-564 (1977). Automation and advances in technologysuch as the replacement of radioisotopes with fluorescence-basedsequencing have reduced the effort required to sequence DNA (Craxton,Methods, 2:20-26 (1991); Ju et al., Proc. Natl. Acad. Sci. USA92:4347-4351 (1995); Tabor and Richardson, Proc. Natl. Acad. Sci. USA92:6339-6343 (1995)). Automated sequencers are available from, forexample, Pharmacia Biotech, Inc., Piscataway, N.J. (Pharmacia ALF),LI-COR, Inc., Lincoln, Nebr. (LI-COR 4,000) and Millipore, Bedford,Mass. (Millipore BaseStation).

In addition, advances in capillary gel electrophoresis have also reducedthe effort required to sequence DNA and such advances provide a rapidhigh resolution approach for sequencing DNA samples (Swerdlow andGesteland, Nucleic Acids Res. 18:1415-1419 (1990); Smith, Nature349:812-813 (1991); Luckey et al., Methods Enzymol. 218:154-172 (1993);Lu et al., J. Chromatog. A. 680:497-501 (1994); Carson et al., Anal.Chem. 65:3219-3226 (1993); Huang et al., Anal. Chem. 64:2149-2154(1992); Kheterpal et al., Electrophoresis 17:1852-1859 (1996); Quesadaand Zhang, Electrophoresis 17:1841-1851 (1996); Baba, Yakugaku Zasshi117: 265-281 (1997)).

A number of sequencing techniques are known in the art, includingfluorescence-based sequencing methodologies. These methods have thedetection, automation and instrumentation capability necessary for theanalysis of large volumes of sequence data. Currently, the 377 DNASequencer (Perkin-Elmer Corp., Applied Biosystems Div., Foster City,Calif.) allows the most rapid electrophoresis and data collection. Withthese types of automated systems, fluorescent dye-labeled sequencereaction products are detected and data entered directly into thecomputer, producing a chromatogram that is subsequently viewed, stored,and analyzed using the corresponding software programs. These methodsare known to those of skill in the art and have been described andreviewed (Birren et al., Genome Analysis: Analyzing DNA, 1, Cold SpringHarbor, N.Y. (1999)).

PHRED is used to call the bases from the sequence trace files (http://www.mbt.washington.edu). PHRED uses Fourier methods to examine the four base traces in the region surrounding each point in the data set in order to predict a series of evenly spaced peak locations. That is, it determines where the peaks would be centered if there were no compressions, dropouts, or other factors shifting the peaks from their “true” locations. Next, PHRED examines each trace to find the centers of the actual, or observed, peaks and the areas of these peaks relative to their neighbors. The peaks are detected independently along each of the four traces, so many peaks overlap. A dynamic programming algorithm is used to match the observed peaks detected in the second step with the predicted peak locations found in the first step.

After the base calling is completed, contaminating sequences (E. coli, BAC vector sequences >50 bases, and sub-cloning vector) are removed and constraints are made for the assembler. Contigs are assembled using CAP3 (Huang, et al., Genomics 46: 37-45 (1997)). A two-step re-assembly process is employed to reduce sequence redundancies caused by overlaps between BAC clones. In the first step, BAC clones are grouped into clusters based on overlaps between contig sequences from different BACs. These overlaps are identified by comparing each sequence in the dataset against every other sequence by BLASTN. BACs containing overlaps greater than 5,000 base pairs in length and greater than 94% in sequence identity are put into the same cluster. Repetitive sequences are masked prior to this procedure to avoid false joining by repetitive elements present in the genome. In the second step, sequences from each BAC cluster are assembled by PHRAP.longread, which is able to handle very long sequences. A minimum match is set at 100 bp and a minimum score is set at 600 as a threshold to join input contigs into longer contigs.
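The first re-assembly step, grouping BACs by pairwise overlap, can be sketched with a union-find structure. This is an illustrative sketch, not the pipeline actually used: it assumes the BLASTN results have already been reduced to (BAC, BAC, overlap length, percent identity) tuples, and it assumes clustering is transitive (single linkage), which the text does not state explicitly. All names and example values are hypothetical.

```python
from collections import defaultdict

MIN_OVERLAP_BP = 5_000     # threshold from the text: overlap > 5,000 bp
MIN_IDENTITY_PCT = 94.0    # threshold from the text: identity > 94%


def cluster_bacs(bacs, hits):
    """Group BAC names into clusters.

    bacs: list of BAC names.
    hits: iterable of (bac_a, bac_b, overlap_bp, identity_pct) tuples,
          assumed to be pre-parsed from pairwise BLASTN output; every
          BAC named in a hit must also appear in `bacs`.
    """
    parent = {b: b for b in bacs}

    def find(x):
        # Walk to the cluster representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, overlap_bp, identity_pct in hits:
        if a != b and overlap_bp > MIN_OVERLAP_BP and identity_pct > MIN_IDENTITY_PCT:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = defaultdict(set)
    for b in parent:
        clusters[find(b)].add(b)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example data.
example_bacs = ["BAC1", "BAC2", "BAC3", "BAC4"]
example_hits = [
    ("BAC1", "BAC2", 7_200, 98.5),   # passes both thresholds
    ("BAC2", "BAC3", 6_100, 95.0),   # passes both thresholds
    ("BAC3", "BAC4", 4_000, 99.0),   # overlap too short
    ("BAC1", "BAC4", 9_000, 90.0),   # identity too low
]
print(cluster_bacs(example_bacs, example_hits))
```

Each resulting cluster would then be passed to the second-step assembler as one input set.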

Example 3 Identifying Genes within a Genomic BAC Library

This example illustrates the identification of combigenes within the rice genomic contig library as assembled in Example 2. The genes and partial genes that are embedded in such contigs are identified through a series of informatic analyses. The tools to define genes fall into two categories: homology-based and predictive-based methods. Homology-based searches (e.g., GAP2, BLASTX supplemented by NAP and TBLASTX) detect conserved sequences during comparisons of DNA sequences or hypothetically translated protein sequences to public and/or proprietary DNA and protein databases. Existence of an Oryza sativa gene is inferred if significant sequence similarity extends over the majority of the target gene. Since homology-based methods may overlook genes unique to Oryza sativa, for which homologous nucleic acid molecules have not yet been identified in databases, gene prediction programs are also used. Predictive methods employed in the definition of the Oryza sativa genes included the use of the GenScan gene predictive software program, which is available from Stanford University (e.g., at the website: gnomic/stanford.edu/GENSCANW.html), and the GeneMark.hmm for Eukaryotes program from Gene Probe, Inc (Atlanta, Ga.; www.geneprobe.net/index.htm). GenScan, in general terms, infers the presence and extent of a gene through a search for “gene-like” grammar. GeneMark.hmm searches a file containing DNA sequence data for genes. It employs a Hidden Markov Model algorithm with a species-specific inhomogeneous Markov model of gene-encoding regions of DNA.

The homology-based methods that are used to define the Oryza sativa gene set included GAP2, BLASTX supplemented by NAP and TBLASTX. For a description of BLASTX and TBLASTX see Coulson, Trends in Biotechnology 12:76-80 (1994) and Birren et al., Genome Analysis, 1:543-559 (1997). GAP2 and NAP are part of the Analysis and Annotation Tool (AAT) for Finding Genes in Genomic Sequences, which was developed by Xiaoqiu Huang at Michigan Tech University and is available at the web site genome.cs.mtu.edu/. The AAT package includes two sets of programs, one set DPS/NAP (referred to as “NAP”) for comparing the query sequence with a protein database, and the other set DDS/GAP2 (referred to as “GAP2”) for comparing the query sequence with a cDNA database. Each set contains a fast database search program and a rigorous alignment program. The database search program identifies regions of the query sequence that are similar to a database sequence. Then the alignment program constructs an optimal alignment for each region and the database sequence. The alignment program also reports the coordinates of exons in the query sequence. See Huang, et al., Genomics 46: 37-45 (1997). The GAP2 program computes an optimal global alignment of a genomic sequence and a cDNA sequence without penalizing terminal gaps. A long gap in the cDNA sequence is given a constant penalty. The DNA-DNA alignment by GAP2 adjusts penalties to accommodate introns. The GAP2 program makes use of splice site consensuses in alignment computation. GAP2 delivers the alignment in linear space, so long sequences can be aligned. See Huang, Computer Applications in the Biosciences 1-235 (1994). The GAP2 program aligns the Oryza sativa contigs with a library of 42,260 Oryza sativa cDNAs. The NAP program computes a global alignment of a DNA sequence and a protein sequence without penalizing terminal gaps. NAP handles frameshifts and long introns in the DNA sequence. The program delivers the alignment in linear space, so long sequences can be aligned. It makes use of splice site consensuses in alignment computation. Both strands of the DNA sequence are compared with the protein sequence and one of the two alignments with the larger score is reported. See Huang and Zhang, Computer Applications in the Biosciences 12(6), 497-506 (1996).

NAP takes a nucleotide sequence, translates it in three forward reading frames and three reverse complement reading frames, and then compares the six translations against a protein sequence database (e.g., the non-redundant protein database, i.e., the nr-aa database maintained by the National Center for Biotechnology Information as part of GenBank).
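The six-frame translation described above can be sketched as follows. This is only an illustration of the general technique using the standard genetic code; it is not the NAP implementation, and the function names (`revcomp`, `translate`, `six_frames`) are hypothetical. Codons containing characters other than A, C, G or T are rendered as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return the three forward and three reverse-complement translations."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six translations would then be compared against the protein database.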

The first homology-based search for genes in the Oryza sativa contigs is effected using the GAP2 program and the Oryza sativa library of clustered Oryza sativa cDNA. The Oryza sativa clusters are mapped onto an assembly of Oryza sativa contigs using the GAP2 program. GAP2 standards for selecting a DNA-DNA match are >92% sequence identity with the following parameters:

gap extension penalty=1

match score=2

gap open penalty=6

gap length for constant penalty=20

mismatch penalty=−2

minimum exon length=21

minimum total length of all exons in a gene (in nucleotide)=200

When a particular Oryza sativa cDNA aligns to more than one Oryza sativa contig, the alignment with the highest identity is selected and alignments with lower levels of identity are filtered out as spurious alignments. Oryza sativa cDNA sequences aligning to Oryza sativa contigs with exceptionally low complexity are filtered out when the basis for alignment included a high number of cDNAs with poly A tails aligning to genomic regions with extended repeats of A or T.
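The selection rule for cDNAs that align to several contigs can be sketched as a simple filter. The sketch assumes alignments are available as (cDNA, contig, percent identity) tuples; the function name `best_gap2_hits` and the record layout are hypothetical, and the low-complexity filter is not modeled.

```python
def best_gap2_hits(alignments, min_identity=92.0):
    """Keep, for each cDNA, only its highest-identity alignment above the cutoff.

    alignments: iterable of (cdna_id, contig_id, percent_identity).
    The > 92% identity cutoff follows the GAP2 standard stated in the text.
    """
    best = {}
    for cdna, contig, identity in alignments:
        if identity <= min_identity:
            continue  # fails the DNA-DNA match standard
        if cdna not in best or identity > best[cdna][1]:
            best[cdna] = (contig, identity)
    return best


# Hypothetical alignments for two cDNAs.
hits = [
    ("cDNA1", "contigA", 99.1),
    ("cDNA1", "contigB", 94.0),   # lower identity, filtered out
    ("cDNA2", "contigC", 90.5),   # below the 92% cutoff
]
print(best_gap2_hits(hits))
```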

The second homology-based method used for gene discovery is BLASTX hits extended with the NAP software package. BLASTX is run with the Oryza sativa genomic contigs as queries against the GenBank non-redundant protein data library identified as “nr-aa”. NAP is used to better align the amino acid sequences as compared to the genomic sequence. NAP extends the match in regions where BLASTX has identified high-scoring-pairs (HSPs), predicts introns, and then links the exons into a single ORF prediction. Experience suggests that NAP tends to mis-predict the first exon. The NAP parameters are:

gap extension penalty=1

gap open penalty=15

gap length for constant penalty=25

min exon length (in aa)=7

minimum total length of all exons in a gene (in nucleotide)=200

homology >40%

The NAP alignment score and GenBank reference number for the best match are reported for each contig for which there is a NAP hit.

In the final homology-based method, TBLASTX is used with cDNA information from four plant sequencing projects: 27,037 sequences from Triticum aestivum, 136,074 sequences from Glycine max, 71,822 sequences from Zea mays and 68,517 sequences from Arabidopsis thaliana. Conservative standards for inclusion of TBLASTX hits into the gene set are utilized. These standards are a minimal E value of 11E-16 and a minimal match of 150 bp in the Oryza sativa contig.

The GenScan program is “trained” with Arabidopsis thaliana characteristics. Though better than the “off-the-shelf” version, the GenScan trained to identify Oryza sativa genes proved more proficient at predicting exons than predicting full-length genes. Predicting full-length genes is compromised by point mutations in the unfinished contigs, as well as by the short length of the contigs relative to the typical length of a gene. Due to the errors found in the full-length gene predictions by GenScan, inclusion of GenScan-predicted genes is limited to those genes and exons whose probabilities are above a conservative probability threshold. The GenScan parameters are:

weighted mean GenScan P value >0.4

mean GenScan T value >0

mean GenScan Coding score >50

length >200 bp

minimum total length of all exons in a gene=500

The weighted mean GenScan P value is a probability for correctly predicting ORFs or partial ORFs and is defined as $\left(1/\sum_i l_i\right)\left(\sum_i l_i P_i\right)$, where $l_i$ is the length of exon $i$ and $P_i$ is the probability of correctness for that exon.
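As a worked illustration of the length-weighted mean, the following minimal sketch computes the value from a list of (exon length, exon probability) pairs. The function name `weighted_mean_p` and the example exons are hypothetical.

```python
def weighted_mean_p(exons):
    """Length-weighted mean of exon probabilities.

    exons: list of (length_bp, probability) pairs for one predicted gene.
    Returns sum(l_i * P_i) / sum(l_i).
    """
    total_length = sum(length for length, _ in exons)
    if total_length == 0:
        raise ValueError("at least one exon with non-zero length is required")
    return sum(length * p for length, p in exons) / total_length


# A 100 bp exon with P = 0.9 and a 300 bp exon with P = 0.5:
# (100 * 0.9 + 300 * 0.5) / 400 = 0.6, which passes the > 0.4 threshold.
value = weighted_mean_p([(100, 0.9), (300, 0.5)])
print(round(value, 3), value > 0.4)
```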

The GeneMark.hmm for Eukaryotes program uses the Hidden Markov model for the species Oryza sativa. The minimum total length of all exons in a gene is 500 bp. Except for the model selection, there is no specific run-time parameter for GeneMark.hmm.

The gene predictions from these programs are stored in a database and then combigenes are derived from these predictions. A combigene is a cluster of putative genes which satisfy the following criteria:

All genes making up a single combigene are located on the same strand of a contig;

Maximum intron size of a valid gene is 4000 bp;

Maximum distance between any two genes in the same combigene is 200 bp, as measured by the bases between adjacent ending exons;

If an individual gene is predicted by NAP, it has at least 40% sequence identity to its hit;

If an individual gene is predicted by GAP2, it has at least 92% sequence identity to its hit;

If an individual gene is predicted by Genscan, the weighted average of the probabilities calculated for all of its exons is not less than 0.4. The gene boundaries of a Genscan-predicted gene are determined while taking into account only exons.

Since TBLASTX-predicted genes are strandless, the combigene which is made up of such genes can be assigned a strand only if there is a gene in the cluster that was predicted by a strand-defining gene-predicting program.
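The strand and distance criteria can be illustrated with a simplified sketch. It assumes each predicted gene has already passed its program-specific identity or probability filter and is given as (start, end, strand), with strand set to `None` for strandless TBLASTX predictions. The function name `build_combigenes`, the greedy left-to-right merging, and the gap definition (bases between the end of one gene and the start of the next) are assumptions made for illustration, not the procedure actually used.

```python
def build_combigenes(genes, max_gap=200):
    """Greedily merge adjacent gene predictions into combigenes.

    genes: list of (start, end, strand) with strand '+', '-' or None.
    Two neighbors are merged when at most max_gap bases separate them
    and their strands do not conflict.
    """
    combigenes = []
    for start, end, strand in sorted(genes, key=lambda g: (g[0], g[1])):
        if combigenes:
            last = combigenes[-1]
            gap = start - last["end"] - 1
            compatible = (strand is None or last["strand"] is None
                          or strand == last["strand"])
            if gap <= max_gap and compatible:
                last["end"] = max(last["end"], end)
                # A strandless cluster inherits the strand of a stranded member.
                last["strand"] = last["strand"] or strand
                last["members"] += 1
                continue
        combigenes.append({"start": start, "end": end,
                           "strand": strand, "members": 1})
    return combigenes


predictions = [
    (100, 500, "+"),     # e.g. a GAP2 prediction
    (600, 900, None),    # strandless TBLASTX prediction, joins the first
    (1000, 1500, "+"),
    (1600, 2000, "-"),   # opposite strand, starts a new combigene
    (5000, 5500, None),  # too far away, remains strandless
]
for combigene in build_combigenes(predictions):
    print(combigene)
```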

Example 4 Identifying Promoters in the Genomic BAC Library using Bioinformatic Techniques

Candidate promoter sequences are selected by identifying the regions of DNA located immediately upstream of “combigenes” as described and defined in Example 3. The length of the region to be extracted from the corresponding contig's sequence is set to be 1500 nucleotides plus the very first nucleotide of a combigene. Thus, if a combigene is sufficiently far from the edge of a contig, a 1501 nucleotide sequence is obtained; otherwise the sequence will be shorter. Only coding region predictions are considered when building combigenes. Therefore, the 5′ UTR of the putative cDNA is included as part of the combigene upstream region.

If there is an AAT/NAP-predicted component in a combigene, then the putative promoter sequence is extracted upstream of the beginning of that component; otherwise, the sequence is extracted upstream of the beginning of the combigene (which may correspond to a Genscan, AAT/GAP or TBLASTX prediction).
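The extraction rule (up to 1500 upstream nucleotides plus the first nucleotide of the combigene, truncated at the contig edge) can be sketched as follows. The sketch assumes a forward-strand combigene and a 0-based start coordinate; the function name `upstream_region` is hypothetical, and reverse-strand combigenes, which would require reverse complementing, are not handled.

```python
def upstream_region(contig_seq, combigene_start, upstream_len=1500):
    """Return the candidate promoter region for a forward-strand combigene.

    combigene_start: 0-based index of the first nucleotide of the combigene.
    The result contains up to upstream_len upstream nucleotides followed by
    that first nucleotide, so it is at most upstream_len + 1 long.
    """
    begin = max(0, combigene_start - upstream_len)
    return contig_seq[begin:combigene_start + 1]


# Hypothetical contig: 2000 filler bases followed by a short gene start.
contig = "C" * 2000 + "ATGAAA"
print(len(upstream_region(contig, 2000)))  # far from the edge: full length
print(len(upstream_region(contig, 300)))   # near the edge: shorter region
```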

Promoter candidates are further selected using bioinformatic analysis of the candidate promoter sequence.

The candidate promoter regions listed in SEQ ID NO:1 through SEQ ID NO:57467 are analyzed for known promoter motifs listed in Table 2.

The identification of such motifs provides important information about the candidate promoter. For example, some motifs are associated with informative annotations such as “light inducible binding site” or “stress inducible binding motif” and can be used to select with confidence a promoter that is able to confer light inducibility or stress inducibility to an operably-linked transgene, respectively.

Putative promoter sequences are also searched with matrices for the TATA box, GC box (factor name: V_GC_01) and CCAAT box (factor name: F_HAP234_01). The matrix for the TATA box is from the Eukaryotic Promoter Database (http://www.epd.isb-sib.ch/) and the matrices for the GC box and the CCAAT box are from Transfac.

The algorithm that is used to annotate promoters searches for matches to both sequence motifs and matrix motifs. First, individual matches are found. For sequence motifs, a maximum number of mismatches is allowed (see Table 2). If the code M, R, W, S, Y, or K is listed in the sequence motif (each of which is a degenerate code for 2 nucleotides), ½ mismatch is allowed. If the code B, D, H, or V is listed in the sequence motif (each of which is a degenerate code for 3 nucleotides), ⅓ mismatch is allowed. The p values are determined by simulation: 5 Mb of random DNA with the same dinucleotide frequency as the test set is generated, and the probability of a given matrix score is determined (number of hits/5e7). Once the individual hits have been found, the putative promoter sequence is searched for clusters of hits in a 250 bp window. The score for a cluster is found by summing the negative natural log of the p value for each individual hit. Using 100 Mb simulations as described above, the probability of a window having a cluster score greater than or equal to the given value is determined. Clusters with a p value more significant than p<1e-6 are reported. Only the top 287 hits are taken and are ranked by p value. Effects of repetitive elements are screened. If the 287th ranked hit has the same p value as the first ranked hit, no results are reported for that factor.
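The cluster score, the sum of the negative natural log of the p value of each hit inside a 250 bp window, can be sketched with a sliding window over sorted hit positions. The function name `best_cluster_score`, the hit representation as (position, p value) pairs, and the convention that two hits share a window when their positions differ by less than the window size are assumptions for illustration. The simulation step that converts a cluster score into a p value is not shown.

```python
import math


def best_cluster_score(hits, window=250):
    """Return the highest cluster score found in any window.

    hits: list of (position_bp, p_value) for individual motif matches.
    A cluster score is the sum of -ln(p) over the hits in one window.
    """
    hits = sorted(hits)
    best = 0.0
    running = 0.0
    left = 0
    for position, p_value in hits:
        running += -math.log(p_value)
        # Drop hits that no longer fit in a window ending at this position.
        while position - hits[left][0] >= window:
            running -= -math.log(hits[left][1])
            left += 1
        best = max(best, running)
    return best


# Hypothetical hits: the first two fall in one window, the third is isolated.
example_hits = [(10, 1e-3), (100, 1e-2), (600, 1e-4)]
print(round(best_cluster_score(example_hits), 3))
```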

For matrix motifs, a p value cutoff is used on a matrix score. The matrix score is determined by adding the path of a given DNA sequence through a matrix. P values are determined by simulation; 5 Mb of random DNA with the same dinucleotide frequency as a test set is generated to test individual matrix hits and 100 Mb is used to test clusters; the probability of a given matrix score and the probability scores for clusters are determined in the same way as for the sequence motifs. The usual cutoff for matrices is 2.5e-4. No clustering is done for the TATA box, GC box or CCAAT box.
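"Adding the path of a sequence through a matrix" can be read as summing, for each position of a candidate site, the matrix entry for the base observed there. The sketch below uses a toy four-column matrix invented purely for illustration; it is not the Eukaryotic Promoter Database TATA matrix or a Transfac matrix, and the function names are hypothetical.

```python
def matrix_score(matrix, site):
    """Sum the matrix entries along the path taken by the site's bases."""
    return sum(column[base] for column, base in zip(matrix, site))


def scan(matrix, sequence):
    """Score every window of the sequence; returns (offset, score) pairs."""
    width = len(matrix)
    return [(i, matrix_score(matrix, sequence[i:i + width]))
            for i in range(len(sequence) - width + 1)]


# Toy matrix (assumption): one dict of base weights per motif position.
TOY_MATRIX = [
    {"A": 0, "C": 0, "G": 0, "T": 2},
    {"A": 2, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 2},
    {"A": 2, "C": 0, "G": 0, "T": 0},
]

scores = scan(TOY_MATRIX, "GGTATAGG")
print(max(scores, key=lambda pair: pair[1]))
```

In the described procedure, each score would then be converted to a p value by comparison with scores obtained on simulated random DNA.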

Candidate promoters are also selected based on the expression characteristics of the gene that is cis-associated with the candidate promoter (i.e., the native gene). For example, a promoter region located 5′ to a gene which is expressed during a specific stage of development likely plays a key role in the temporal regulation of that gene. Thus the promoter, when operably linked to a heterologous coding sequence, may similarly regulate the heterologous coding sequence.

By combining the motif analysis with the expression analysis, the list of candidate promoters having desired properties can be narrowed. This decreases the overall number of candidate promoters that must be screened to confirm the promoter's function. For example, one can start with seed-expressed transcription factors, identify candidate promoters that match the consensus regulation sites for seed-expressed transcription factors, and then test the identified candidate promoters to confirm the promoter sub-set which is capable of conferring seed-specific expression to a gene.

Example 5 Identifying Promoters in the Genomic BAC Library using an Expression Assay

Promoters may also be identified based on quantitative analysis of genes that are cis-associated with candidate promoters (i.e., the native genes). In this method, the native genes associated with SEQ ID NO: 1 through SEQ ID NO: 18 are analyzed on a digital northern blot. Digital northern data can be generated from EST sequencing, SAGE and other methods, which in effect count RNA molecules expressed in a cell. This data can be generated as needed, or is generally available to the public on a number of web sites (e.g., www.tigr.org). Data can be obtained from any plant species, although data on rice gene expression is particularly preferred. Promoters are selected based on the expression information of the digital northern. For example, identifying genes expressed under stress-related conditions would provide a group of promoters able to confer such stress-inducible expression to other genes.
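Because a digital northern in effect counts sequenced transcripts, it can be illustrated as a tally of EST hits per gene, normalized by the size of each library. The sketch below assumes the data are available as (library, gene) pairs; the function name `digital_northern` and the example libraries are hypothetical.

```python
from collections import Counter


def digital_northern(est_hits):
    """Return the fraction of each library's ESTs assigned to each gene.

    est_hits: list of (library_name, gene_id), one entry per sequenced EST.
    """
    library_totals = Counter(library for library, _ in est_hits)
    pair_counts = Counter(est_hits)
    return {(library, gene): count / library_totals[library]
            for (library, gene), count in pair_counts.items()}


# Hypothetical EST assignments from a stressed and a control library.
ests = ([("stress", "gene1")] * 3 + [("stress", "gene2")]
        + [("control", "gene1")] + [("control", "gene2")] * 3)
freq = digital_northern(ests)
# gene1 is enriched under stress, so its promoter would be a candidate
# for conferring stress-inducible expression.
print(freq[("stress", "gene1")], freq[("control", "gene1")])
```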

Example 6 Identifying Promoters in the Genomic BAC Library using Microarray Analysis

Promoters may also be selected by transcriptional profiling or microarray analysis. Transcriptional profiling can be completed on a large scale for each cis-linked gene associated with SEQ ID NO: 1 through SEQ ID NO: 18. Transcription profiling data can be obtained on RNA prepared from any plant species using a chip comprised of sequences from any plant species, although data generated from rice using a rice chip is preferred.

A comprehensive database of transcription profiling data narrows down the list of promoter candidates that confer a desired expression pattern. For example, a promoter that confers drought-specific expression can be selected by identifying a cis-linked gene that is induced under drought conditions (on the microarray), but is not expressed at other stages of plant growth and development. Such a promoter is likely to confer drought inducibility to an operably linked transgene. Public databases of transcript profiling data are becoming more comprehensive and thereby enabling this type of analysis.
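The drought example can be sketched as a filter over an expression table: keep genes whose signal is high in the target condition and low everywhere else. The function name `condition_specific`, the thresholds, and the example values are hypothetical; real profiling data would require normalization and statistical testing that are not shown.

```python
def condition_specific(profiles, target, induced_min, background_max):
    """Select genes expressed in the target condition only.

    profiles: {gene_id: {condition: expression_level}}.
    A gene is kept when its level in `target` is at least induced_min and
    its level in every other condition is at most background_max.
    """
    selected = []
    for gene, levels in profiles.items():
        others = [value for condition, value in levels.items()
                  if condition != target]
        if (levels.get(target, 0.0) >= induced_min
                and all(value <= background_max for value in others)):
            selected.append(gene)
    return sorted(selected)


# Hypothetical microarray signals in arbitrary units.
data = {
    "geneA": {"drought": 950.0, "seedling": 12.0, "flowering": 8.0},
    "geneB": {"drought": 900.0, "seedling": 700.0, "flowering": 650.0},
    "geneC": {"drought": 15.0, "seedling": 10.0, "flowering": 9.0},
}
print(condition_specific(data, "drought", induced_min=500.0,
                         background_max=50.0))
```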

Example 7 Functional Screening of Promoters in an Expression Assay

Promoters are screened in an expression assay. The promoters in SEQ ID NO: 1 through SEQ ID NO: 18 are amplified by PCR from rice genomic DNA and cloned into an expression vector containing a reporter transgene (e.g., GUS or GFP). The individual promoter or a collection of promoters (“promoter library”) are then screened in an expression assay for the ability to express the reporter transgene. In a common expression assay for leaf promoters, the promoters are transfected into rice or maize leaf protoplasts. Reporter gene expression in the protoplasts indicates a promoter capable of conferring gene expression in the leaf. The promoters are also transfected into protoplasts from other tissues or plant species to identify other regulatory features of the promoter.

Alternatively, promoters may be screened using a particle gun technique to bombard the cells, tissues or plants. The bombarded samples are visually inspected for reporter gene expression. Reporter gene expression observed in any bombarded samples indicates the presence of a promoter able to confer expression of a transgene in that cell, tissue or plant.

The promoters may also be screened in plants where transformation protocols have been greatly enhanced to facilitate the screening of large numbers of promoters. In this approach, the individual rice promoters or “promoter library” is transformed into Arabidopsis plants. The resulting transformed tissues or progeny are scored for reporter expression. Again, reporter gene expression in a given tissue indicates that a promoter is able to confer transgene expression in that tissue.

For some promoters, such as those providing constitutive expression, a reporter transgene can be replaced with a selectable marker transgene, such as a gene conferring glyphosate tolerance. Transformed cells, tissues or plants expressing the selectable marker are selected, rather than visually scored. For example, the promoter is linked to a selectable marker, such as glyphosate resistance, and the transformed plants are then screened for male sterility. The selected plants, in this case male sterile plants, may contain a promoter for male reproductive tissues.

The promoters described herein can also be used to ablate or kill cells expressing a gene from the promoter. In such cases, the promoter is operably linked to a negative selectable marker gene, including but not limited to the diphtheria toxin gene, or to a conditional lethal gene, including but not limited to the phosphonate ester hydrolase gene (pehA). The negative selectable marker gene is transformed into cells, tissues or plants. The cells, tissues or plants which express the negative selectable gene from the promoter are selectively killed. In the case of the conditional lethal gene, the transformed cells, tissues or plants which express the conditional lethal gene are only killed in the presence of the negative selective agent or negative selective condition. In the example of the phosphonate ester hydrolase gene, the transformed cells, tissues or plants which express the conditional lethal gene are only killed in the presence of glyceryl glyphosate.

Example 8 Identification and Cloning of Regulatory Elements

Regulatory elements are isolated from Oryza sativa genomic DNA. All regulatory elements are sub-cloned into a plant transformation vector operably linking the regulatory elements to the Zea mays HSP70 intron (1-Zm.DnaK-1:1:1, described in U.S. Pat. No. 5,424,412, which is incorporated herein by reference), the coding region for β-glucuronidase (GUS, described in U.S. Pat. No. 5,599,670, which is incorporated herein by reference), and the Agrobacterium tumefaciens NOS gene terminator.

Variants of the rice Metallothionein (MTH) gene's regulatory elements may be isolated from Oryza sativa genomic DNA using sequence-specific primers and PCR amplification methods.

The present invention thus provides isolated polynucleotide molecules having gene regulatory activity (regulatory elements) and DNA constructs comprising the isolated regulatory elements operably linked to a transcribable polynucleotide molecule.

Example 9 Corn Plant Transformation and GUS analysis

Corn plants are transformed with plant expression constructs for histochemical GUS analysis in plants. Plants are transformed using methods known to those skilled in the art. Particle bombardment of corn H99 immature zygotic embryos may be used to produce transgenic maize plants. Ears of maize H99 plants are collected 10-13 days after pollination from greenhouse grown plants and sterilized. Immature zygotic embryos of 1.2-1.5 mm are excised from the ear and incubated at 28° C. in the dark for 3-5 days before use as target tissue for bombardment. DNA comprising an isolated expression cassette containing the selectable marker for kanamycin resistance (NPTII gene) and the screenable marker for β-D-Glucuronidase (GUS gene) is gel purified and used to coat 0.6 micron gold particles (Catalog #165-2262 Bio-Rad, Hercules, Calif.) for bombardment. Macro-carriers are loaded with the DNA-coated gold particles (Catalog #165-2335 Bio-Rad, Hercules Calif.). The embryos are transferred onto osmotic medium scutellum side up. A PDS1000/He biolistic gun is used for transformation (Catalog #165-2257 Bio-Rad, Hercules Calif.). Bombarded immature embryos are cultured and transgenic calli are selected and transferred to tissue formation medium. Transgenic corn plants are regenerated from the transgenic calli and transferred to the greenhouse.

GUS activity is qualitatively and quantitatively measured using methods known to those skilled in the art. Plant tissue samples are collected from the same tissue for both the qualitative and quantitative assays. For qualitative analysis, whole tissue sections are incubated with the GUS staining solution X-Gluc (5-bromo-4-chloro-3-indolyl-β-glucuronide) (1 milligram/milliliter) for an appropriate length of time, rinsed, and visually inspected for blue coloration. For quantitative analysis, total protein is first extracted from each tissue sample. One microgram of total protein is used with the fluorogenic substrate 4-methylumbelliferyl-β-D-glucuronide (MUG) in a total reaction volume of 50 μl (microliters). The reaction product 4-methylumbelliferone (4-MU) is maximally fluorescent at high pH. Addition of a basic solution of sodium carbonate simultaneously stops the assay and adjusts the pH for quantifying the fluorescent product. Fluorescence is measured with excitation at 365 nm and emission at 445 nm using a Fluoromax-3 with Micromax Reader, with slit width set at excitation 2 nm and emission 3 nm. The GUS activity is expressed as pmole of 4-MU/microgram of protein/hour (pMole of 4-MU/μg protein/hour).

Example 10 MTH Regulatory Element Analysis in Stable Transgenic Corn Plants

Corn plants representing nine F1 events (plants representing an independent event produced from R0 transgenic plants crossed with non-transgenic H99 plants) transformed with pMON94302 (comprising SEQ ID NO: 16) were analyzed for GUS activity as described above. Corn plants representing ten F1 events (plants representing an independent event produced from R0 transgenic plants crossed with non-transgenic H99 plants) transformed with pMON84008 (comprising SEQ ID NO: 11) were analyzed for GUS activity as described above. Mean levels of GUS activity (pMole of 4-MU/μg protein/hour) for each stage of plant development and organ tested are provided in Table 2 and Table 3 below as mean GUS activity +/- standard error (SE). The following abbreviations are used: none detected by visible detection methods (ND), three leaf stage (V3), seven leaf stage (V7), tasseling stage (VT), days after germination (DAG), and days after pollination (DAP). Specific cell types for which GUS expression was noted are provided in Table 3.
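The reported quantities (activity in pmole of 4-MU per μg protein per hour, and the range and mean +/- SE across plants) can be illustrated with a short sketch. The function names `gus_activity` and `summarize` and the example readings are hypothetical. The standard error is computed here as the sample standard deviation divided by the square root of n, with 0.0 returned for a single measurement; this is an assumption, since the text does not state how SE was calculated.

```python
import math
import statistics


def gus_activity(pmol_4mu, protein_ug, hours):
    """Activity in pmol 4-MU per microgram protein per hour."""
    return pmol_4mu / (protein_ug * hours)


def summarize(activities):
    """Range, mean and standard error of a list of activity values."""
    n = len(activities)
    mean = statistics.mean(activities)
    # Sample standard deviation is undefined for n = 1, so SE is set to 0.0.
    se = statistics.stdev(activities) / math.sqrt(n) if n > 1 else 0.0
    return {"range": (min(activities), max(activities)),
            "mean": mean, "se": se}


# Hypothetical readings: pmol of 4-MU formed by 1 ug protein in 2 hours.
readings = [20.0, 40.0, 60.0]
activities = [gus_activity(pmol, protein_ug=1.0, hours=2.0)
              for pmol in readings]
summary = summarize(activities)
print(summary["range"], summary["mean"], round(summary["se"], 2))
```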

TABLE 2
Os.MTH Regulatory Element Expression in Transgenic Corn Plant Tissues
pMON94302 P-Os.Metallothionein-a-1:1:7

Stages        Organ            Inducer      Range            Mean ± SE
Imbibed seed  Embryo           -            4.18-4.18        4.18 ± 0.00
Imbibed seed  Endosperm        -            2.72-2.72        2.72 ± 0.00
3 DAG         Root             -            126.00-286.79    220.91 ± 48.63
V3            Root main        Unstress     104.68-237.08    148.73 ± 29.91
V3            Root crown       -            42.25-257.78     171.88 ± 48.47
V7            Root seminal     -            11.46-819.32     265.37 ± 120.91
V7            Root crown       -            16.93-993.98     294.39 ± 136.87
VT            Root seminal     -            8.15-15.04       11.59 ± 3.45
VT            Root crown       -            149.41-180.08    164.75 ± 15.33
3 DAG         Coleoptile       -            138.83-375.19    264.31 ± 37.23
V3            Leaf             Unstress     381.95-1116.41   730.12 ± 185.74
V7            Leaf-Mature      -            48.67-71.77      61.46 ± 4.88
VT            Internode        -            89.68-473.01     205.88 ± 69.93
VT            Cob              -            23.92-223.94     99.01 ± 44.80
VT            Anther           -            20.48-40.97      32.78 ± 4.37
VT            Pollen           -            <0.1-<0.1        <0.1 ± 0.00
VT            Silk             -            15.77-35.03      23.33 ± 5.93
21 DAP        Embryo           -            <0.1-<0.1        <0.1 ± 0.00
35 DAP        Embryo           -            5.57-13.22       9.39 ± 3.83
10 DAP        Kernel           -            18.73-259.87     105.57 ± 12.71
21 DAP        Endosperm        -            11.28-140.04     72.86 ± 12.97
35 DAP        Endosperm        -            25.90-53.71      37.62 ± 3.44

Range: lowest and highest activity of individual seedlings across events; Mean/SE: overall mean across all the events. DAG: Days After Germination; DAP: Days After Pollination; Em: Embryo; En: Endosperm; VT: Tasseling stage; IS: Imbibed seed; C: coleoptile; R: Root; L: Leaf; V3: three leaf stage; V7: Seven leaf stage; nd: not determined.

TABLE 3
Os.MTH Regulatory Element Expression in Transgenic Corn Plant Tissues
pMON84008 P-Os.Metallothionein-b-1:1:2

Stages        Organ            Inducer      Range            Mean ± SE
Imbibed seed  Embryo           -            17.05-17.05      17.05 ± 0.00
Imbibed seed  Endosperm        -            <0.1-<0.1        <0.1 ± 0.00
3 DAG         Root             -            10.32-836.89     330.82 ± 256.01
V3            Root main        Unstress     <0.1-<0.1        <0.1 ± 0.00
V3            Root crown       Unstress     <0.1-<0.1        <0.1 ± 0.00
V3            Root main        Cold         2.27-2.27        2.27 ± 0.00
V3            Root crown       Cold         <0.1-<0.1        <0.1 ± 0.00
V3            Root main        Desiccation  <0.1-<0.1        <0.1 ± 0.00
V3            Root crown       Desiccation  nd-nd            nd ± 0.00
V7            Root seminal     -            <0.1-<0.1        <0.1 ± 0.00
V7            Root crown       -            <0.1-<0.1        <0.1 ± 0.00
VT            Root seminal     -            46.78-46.78      46.78 ± 0.00
VT            Root crown       -            20.97-20.97      20.97 ± 0.00
3 DAG         Coleoptile       -            11.78-757.74     199.07 ± 114.28
V3            Leaf             Unstress     <0.1-<0.1        <0.1 ± 0.00
V3            Leaf             Cold         15.60-15.60      15.60 ± 0.00
V3            Leaf             Desiccation  25.57-25.57      25.57 ± 0.00
V7            Leaf-Mature      -            <0.1-0.00        <0.1 ± 0.00
V7            Leaf-Young       -            1.09-1.09        1.09 ± 0.00
VT            Leaf-Mature      -            <0.1-<0.1        <0.1 ± 0.00
VT            Leaf-Senescence  -            <0.1-<0.1        <0.1 ± 0.00
VT            Internode        -            25.86-48.10      36.98 ± 11.12
VT            Cob              -            11.29-32.29      21.79 ± 10.50
VT            Anther           -            53.57-53.57      53.57 ± 0.00
VT            Pollen           -            18.51-584.48     207.97 ± 80.44
VT            Silk             -            4.91-18.28       10.18 ± 1.28
14 DAP        Embryo           -            62.40-165.15     113.77 ± 51.37
21 DAP        Embryo           -            <0.1-<0.1        <0.1 ± 0.00
35 DAP        Embryo           -            0.43-2.05        1.10 ± 0.49
7 DAP         Kernel           -            <0.1-<0.1        <0.1 ± 0.00
14 DAP        Endosperm        -            <0.1-<0.1        <0.1 ± 0.00
21 DAP        Endosperm        -            <0.1-<0.1        <0.1 ± 0.00
35 DAP        Endosperm        -            0.43-0.43        0.43 ± 0.00

Range: lowest and highest activity of individual seedlings across events; Mean/SE: overall mean across all the events. DAG: Days After Germination; DAP: Days After Pollination; Em: Embryo; En: Endosperm; VT: Tasseling stage; IS: Imbibed seed; C: coleoptile; R: Root; L: Leaf; V3: three leaf stage; V7: Seven leaf stage; nd: not determined.

The Os.MTH expression elements have thus been shown to be useful in expressing transgenes in the cell types and developmental stages as shown above. Having taught the isolation, identification, transformation and expression analysis of two rice Metallothionein gene expression regulatory elements, it is within the ordinary skill of the art to apply the same principles for testing of other such elements.

The present invention thus provides DNA constructs comprising regulatory elements that can modulate expression of an operably linked transcribable polynucleotide molecule and a transgenic plant stably transformed with the DNA construct. From the examples given, the present invention thus provides isolated regulatory elements and isolated promoter fragments from Oryza sativa, particularly Metallothionein gene regulatory elements, that are useful for modulating the expression of an operably linked transcribable polynucleotide molecule. The present invention also provides a method for assembling DNA constructs comprising the isolated regulatory elements and isolated promoter fragments, and for creating a transgenic plant stably transformed with the DNA construct.

Having illustrated and described the principles of the present invention, it should be apparent to persons skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications that are within the spirit and scope of the appended claims. All publications and published patent documents cited in this specification are incorporated herein by reference to the same extent as if each individual publication or patent application is specifically and individually indicated to be incorporated by reference.

We claim:
 1. A polynucleotide construct comprising a regulatory polynucleotide molecule selected from the group consisting of: (a) the nucleic acid sequence of SEQ ID NO:16; (b) a fragment of SEQ ID NO:16 with promoter activity; and (c) a nucleic acid sequence that exhibits a 99% or greater sequence identity to SEQ ID NO:16 and has promoter activity, wherein said regulatory polynucleotide molecule is operably linked to a heterologous transcribable polynucleotide molecule.
 2. The polynucleotide construct of claim 1, wherein said transcribable polynucleotide molecule is a gene of agronomic interest.
 3. The polynucleotide construct of claim 1, wherein said transcribable polynucleotide molecule is a gene controlling the phenotype of a trait selected from the group consisting of: herbicide tolerance, insect control, modified yield, fungal disease resistance, virus resistance, nematode resistance, bacterial disease resistance, plant growth and development, starch production, modified oils production, high oil production, modified fatty acid content, high protein production, fruit ripening, enhanced animal and human nutrition, biopolymers, environmental stress resistance, pharmaceutical peptides and secretable peptides, improved processing traits, improved digestibility, enzyme production, flavor, nitrogen fixation, hybrid seed production, fiber production, and biofuel production.
 4. The polynucleotide construct of claim 3, wherein said herbicide tolerance gene is selected from the group consisting of genes that encode for: phosphinothricin acetyltransferase, glyphosate resistant EPSPS, hydroxyphenyl pyruvate dehydrogenase, dalapon dehalogenase, bromoxynil resistant nitrilase, anthranilate synthase, glyphosate oxidoreductase and glyphosate-N-acetyl transferase.
 5. A transgenic plant cell stably transformed with the polynucleotide construct of claim 1.
 6. A transgenic plant stably transformed with the polynucleotide construct of claim 1.
 7. A seed of said transgenic plant of claim 6, wherein the seed comprises the construct.
 8. A progeny of the plant of claim 6, wherein the progeny comprises said construct.
 9. The transgenic plant cell of claim 5, wherein said plant cell is from a monocotyledonous plant selected from the group consisting of wheat, maize, rye, rice, corn, oat, barley, turfgrass, sorghum, millet and sugarcane.
 10. The transgenic plant of claim 6, wherein said plant is a monocotyledonous plant selected from the group consisting of wheat, maize, rye, rice, corn, oat, barley, turfgrass, sorghum, millet and sugarcane.
 11. A seed of the transgenic plant of claim 10, wherein the seed comprises the construct.
 12. The transgenic plant cell of claim 5, wherein said plant cell is from a dicotyledonous plant selected from the group consisting of tobacco, tomato, potato, soybean, cotton, canola, sunflower and alfalfa.
 13. The transgenic plant of claim 6, wherein said plant is a dicotyledonous plant selected from the group consisting of tobacco, tomato, potato, soybean, cotton, canola, sunflower and alfalfa.
 14. A seed of the transgenic plant of claim 13, wherein the seed comprises the construct.
 15. A method of inhibiting weed growth in a field of transgenic glyphosate-tolerant crop plants comprising a) planting the transgenic glyphosate-tolerant crop plants in the field, wherein the plants are transformed with an expression cassette comprising the polynucleotide construct of claim 1 operably linked to a polynucleotide molecule encoding a glyphosate tolerance gene; and b) applying glyphosate to the field at an application rate that inhibits the growth of weeds, wherein the growth and yield of the transgenic crop plant is not substantially affected by the glyphosate application.